Models: 1,038

Active filters: vllm

ArtusDev/mistralai_Devstral-Small-2505_EXL3_6.0bpw_H8

Text Generation • 9B • Updated May 21 • 8

ArtusDev/mistralai_Devstral-Small-2505_EXL3_6.5bpw_H8

Text Generation • 10B • Updated May 21 • 9

ArtusDev/mistralai_Devstral-Small-2505_EXL3_8.0bpw_H8

Text Generation • 12B • Updated May 21 • 8 • 1

nm-testing/Devstral-Small-2505-FP8-dynamic

Text Generation • 24B • Updated May 21 • 178 • 1

async0x42/Devstral-Small-2505-exl3_4.0bpw

Text Generation • 6B • Updated May 21 • 7

async0x42/Devstral-Small-2505-exl3_4.5bpw

Text Generation • 7B • Updated May 21 • 10

casperhansen/Mistral-Small-3.1-24B-Base-2503-Text-Only

24B • Updated May 22 • 10

Antigma/Devstral-Small-2505-GGUF

Text Generation • 24B • Updated May 29 • 24 • 1

shmdtalha/Mistral-Small-3.1-24B-Instruct-2503

Image-Text-to-Text • Updated May 24 • 5

textgeflecht/Devstral-Small-2505-FP8-llmcompressor

Text Generation • 24B • Updated May 25 • 96

mratsim/Devstral-Small-2505.w4a16-gptq

Text Generation • 4B • Updated May 26 • 70 • 2

andtt/Ministral-8B-Instruct-2410-Q3_K_S-GGUF

8B • Updated May 25

andtt/Ministral-8B-Instruct-2410-Q3_K_M-GGUF

8B • Updated May 25 • 1

andtt/Ministral-8B-Instruct-2410-Q3_K_L-GGUF

8B • Updated May 25

Sertipan/Devstral-Small-2505

Text Generation • 24B • Updated May 27 • 7

huihui-ai/Devstral-Small-2505-abliterated

Text Generation • 24B • Updated Jun 9 • 46 • 7

geninhu/RakutenAI-7B-instruct-GPTQ

Updated May 30 • 4

RedHatAI/DeepSeek-R1-0528-quantized.w4a16

Text Generation • Updated Jun 2 • 1.41k • 9

Mungert/Devstral-Small-2505-GGUF

Text Generation • 24B • Updated 4 days ago • 1.42k • 6

th-nuernberg/DeepHermes-3-Mistral-24B-Preview-FP8-Dynamic

Text Generation • 24B • Updated Jun 3 • 7

Fulstac/deepseek-r1-Distill-Qwen-32B-sqlgen-4bit-v1

Text Generation • 33B • Updated Jun 6 • 5

Fulstac/deepseek-r1-Distill-Qwen-32B-lora-4bit-v3

Text Generation • 33B • Updated Jun 6 • 4

RedHatAI/gemma-3-4b-it-quantized.w8a8

Image-Text-to-Text • 5B • Updated Jun 9 • 190

RedHatAI/gemma-3-4b-it-quantized.w4a16

Image-Text-to-Text • 2B • Updated Jun 9 • 28.2k • 2

RedHatAI/gemma-3-12b-it-quantized.w8a8

Image-Text-to-Text • 13B • Updated Jun 9 • 2.08k • 2

RedHatAI/gemma-3-12b-it-quantized.w4a16

Image-Text-to-Text • 4B • Updated Jun 9 • 746

RedHatAI/gemma-3-27b-it-quantized.w8a8

Image-Text-to-Text • 29B • Updated Jun 9 • 11.5k • 7

RedHatAI/gemma-3-1b-it-quantized.w8a8

Text Generation • 1B • Updated Jun 6 • 1.54k

RedHatAI/gemma-3-1b-it-quantized.w4a16

Text Generation • 0.7B • Updated Jun 6 • 2.53k

bullerwins/Magistral-Small-2506-fp8

24B • Updated Jun 10 • 78