Models: 1,024

Active filters: vllm

RedHatAI/Llama-3.2-90B-Vision-Instruct-FP8-dynamic

Text Generation • 89B • Updated Oct 2, 2024 • 2.76k • 10

soprasteria/Mixtral-8x7B-Instruct-v0.1-FP8

47B • Updated Sep 27, 2024 • 2

RedHatAI/Phi-3.5-mini-instruct-FP8-KV

Text Generation • 4B • Updated Oct 1, 2024 • 5 • 2

RedHatAI/Qwen2.5-0.5B-quantized.w8a16

Text Generation • 0.4B • Updated Nov 26, 2024 • 7

RedHatAI/Qwen2.5-1.5B-quantized.w8a16

Text Generation • 0.8B • Updated Nov 26, 2024 • 8

RedHatAI/Qwen2.5-3B-quantized.w8a16

Text Generation • 1B • Updated Nov 26, 2024 • 10

RedHatAI/Qwen2.5-7B-quantized.w8a16

Text Generation • 3B • Updated Nov 26, 2024 • 14 • 1

RedHatAI/Qwen2.5-32B-quantized.w8a16

Text Generation • 9B • Updated Nov 26, 2024 • 6

RedHatAI/Qwen2.5-72B-quantized.w8a16

Text Generation • 20B • Updated Nov 26, 2024 • 5

RedHatAI/pixtral-12b-FP8-dynamic

Text Generation • 13B • Updated Feb 7 • 8.55k • 10

mlx-community/Ministral-8B-Instruct-2410-bf16

8B • Updated Oct 17, 2024 • 23 • 2

mlx-community/Ministral-8B-Instruct-2410-4bit

1B • Updated Oct 17, 2024 • 189 • 9

mlx-community/Ministral-8B-Instruct-2410-8bit

2B • Updated Oct 17, 2024 • 20 • 2

RedHatAI/Llama-3.1-Nemotron-70B-Instruct-HF-FP8-dynamic

Text Generation • 71B • Updated May 30 • 1.05k • 14

TouchNight/Ministral-8B-Instruct-2410-HF

8B • Updated Oct 18, 2024 • 13

TouchNight/Ministral-8B-Instruct-2410-HF-Q5_K_M-GGUF

8B • Updated Oct 18, 2024 • 2

ijohn07/Ministral-8B-Instruct-2410-HF-Q8_0-GGUF

8B • Updated Oct 19, 2024 • 8

adriabama06/reader-lm-1.5b-AWQ

Text Generation • 0.4B • Updated Nov 1, 2024 • 4 • 1

sasha0552/Ministral-8B-Instruct-2410

Updated Oct 20, 2024 • 2

aashish1904/Ministral-8B-Instruct-2410-HF-Q4_K_M-GGUF

8B • Updated Oct 20, 2024 • 12 • 1

QuantFactory/TouchNight-Ministral-8B-Instruct-2410-HF-GGUF

8B • Updated Oct 20, 2024 • 23 • 2

aashish1904/Ministral-8B-Instruct-2410-HF-Q2_K-GGUF

8B • Updated Oct 20, 2024 • 5 • 2

GrimsenClory/Ministral-8B-Instruct-2410-Q6_K-GGUF

8B • Updated Oct 21, 2024 • 10

QuantFactory/Ministral-8B-Instruct-2410-GGUF

8B • Updated Oct 22, 2024 • 111 • 2

gphorvath/Ministral-8B-Instruct-2410-Q4_K_M-GGUF

8B • Updated Oct 26, 2024 • 4

Gleisson1/Ministral-8B-Instruct-2410-HF-4bit

5B • Updated Oct 26, 2024 • 4

paultimothymooney/Ministral-8B-Instruct-2410-Q8_0-GGUF

8B • Updated Oct 28, 2024 • 2

paultimothymooney/Ministral-8B-Instruct-2410-Q4_K_M-GGUF

8B • Updated Oct 28, 2024 • 3

LouiSeHU/Mistral-Small-Instruct-2409-Q4_0-GGUF

22B • Updated Oct 29, 2024 • 2

yejingfu/nmagic-Meta-Llama-3.1-8B-Instruct-FP8

Text Generation • 8B • Updated Oct 31, 2024 • 5.59k