Models: 114

Active filters: 2210.17323

neuralmagic/Phi-3-medium-128k-instruct-quantized.w8a16

Text Generation • Updated Oct 9 • 6.77k • 2

neuralmagic/Llama-2-7b-chat-quantized.w8a8

Text Generation • Updated Oct 9 • 599 • 1

neuralmagic/Meta-Llama-3-8B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 528 • 2

neuralmagic/Phi-3-mini-128k-instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 576

neuralmagic/Phi-3-medium-128k-instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 1.74k • 2

neuralmagic/Qwen2-1.5B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 1.67k

neuralmagic/Phi-3-mini-128k-instruct-quantized.w4a16

Text Generation • Updated Oct 9 • 848 • 1

neuralmagic/Qwen2-0.5B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 40

neuralmagic/Phi-3-medium-128k-instruct-quantized.w4a16

Text Generation • Updated Oct 9 • 8.49k • 3

neuralmagic/Qwen2-7B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 703

neuralmagic/Meta-Llama-3-70B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 31

neuralmagic/Qwen2-72B-Instruct-quantized.w8a8

Text Generation • Updated Oct 9 • 446 • 1

rinna/llama-3-youko-8b-gptq

Text Generation • Updated Aug 31 • 47

rinna/llama-3-youko-70b-gptq

Text Generation • Updated Aug 31 • 25

rinna/llama-3-youko-70b-instruct-gptq

Text Generation • Updated Aug 31 • 71

neuralmagic/Meta-Llama-3.1-8B-quantized.w8a16

Text Generation • Updated Oct 9 • 429 • 1

neuralmagic/Meta-Llama-3.1-8B-quantized.w8a8

Text Generation • Updated 25 days ago • 640 • 1

neuralmagic/starcoder2-15b-quantized.w8a16

Text Generation • Updated Oct 9 • 433

neuralmagic/starcoder2-3b-quantized.w8a16

Text Generation • Updated Oct 9 • 30

neuralmagic/starcoder2-7b-quantized.w8a16

Text Generation • Updated Oct 9 • 17

neuralmagic/starcoder2-3b-quantized.w8a8

Text Generation • Updated Oct 9 • 18

neuralmagic/starcoder2-7b-quantized.w8a8

Text Generation • Updated Oct 9 • 30

neuralmagic/starcoder2-15b-quantized.w8a8

Text Generation • Updated Oct 9 • 19

onyrotssih/Meta-Llama-3.1-8B-Instruct-quantized.w8a16

Text Generation • Updated Aug 5 • 10

neuralmagic/gemma-2-9b-it-quantized.w8a16

Text Generation • Updated Oct 9 • 2.98k • 1

neuralmagic/gemma-2-2b-it-quantized.w8a16

Text Generation • Updated Oct 9 • 57 • 1

neuralmagic/gemma-2-2b-quantized.w8a16

Text Generation • Updated Oct 9 • 52

neuralmagic/Phi-3-small-128k-instruct-quantized.w8a16

Text Generation • Updated Oct 9 • 574

neuralmagic/gemma-2-2b-it-quantized.w4a16

Text Generation • Updated Oct 9 • 689

neuralmagic/SmolLM-1.7B-Instruct-quantized.w8a16

Text Generation • Updated Oct 9 • 42