Optimised AWQ quants for high-throughput deployments of Gemma2! Compatible with Transformers, TGI & vLLM 🤗
AI & ML interests
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
Optimised quants for high-throughput deployments! Compatible with Transformers, TGI & vLLM 🤗
- hugging-quants/Meta-Llama-3.1-405B-Instruct-AWQ-INT4
  Text Generation • 59B • Updated • 1.57k • 36
- hugging-quants/Meta-Llama-3.1-405B-Instruct-BNB-NF4
  Text Generation • 214B • Updated • 10 • 5
- hugging-quants/Meta-Llama-3.1-405B-Instruct-GPTQ-INT4
  Text Generation • 59B • Updated • 1.19k • 16
- hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4
  Text Generation • 11B • Updated • 440k • 105
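As a rough illustration of how one of the Llama 3.1 repositories listed above might be loaded with vLLM, here is a minimal sketch. It assumes the `<model>-<METHOD>-<PRECISION>` naming pattern visible in the repository ids. The helper names (`parse_repo_id`, `vllm_kwargs`, `load_with_vllm`) and the `tensor_parallel_size` values are hypothetical, and the mapping deliberately covers only the AWQ and GPTQ checkpoints.

```python
# Sketch only: helper names are hypothetical; repo ids come from the listing.

# Map the method tag in the repo name to vLLM's `quantization` argument.
# The BNB checkpoint is left out on purpose; this sketch does not cover it.
VLLM_QUANTIZATION = {"AWQ": "awq", "GPTQ": "gptq"}


def parse_repo_id(repo_id: str) -> dict:
    """Split 'org/<model>-<METHOD>-<PRECISION>' (assumed naming pattern)."""
    org, name = repo_id.split("/", 1)
    model, method, precision = name.rsplit("-", 2)
    return {"org": org, "model": model, "method": method, "precision": precision}


def vllm_kwargs(repo_id: str, tensor_parallel_size: int = 1) -> dict:
    """Build keyword arguments for vllm.LLM from a repository id."""
    method = parse_repo_id(repo_id)["method"]
    if method not in VLLM_QUANTIZATION:
        raise ValueError(f"no vLLM mapping in this sketch for method {method!r}")
    return {
        "model": repo_id,
        "quantization": VLLM_QUANTIZATION[method],
        "tensor_parallel_size": tensor_parallel_size,
    }


def load_with_vllm(repo_id: str, tensor_parallel_size: int = 1):
    """Instantiate the model; needs vLLM installed and enough GPU memory."""
    from vllm import LLM  # imported lazily so the helpers work without vLLM

    return LLM(**vllm_kwargs(repo_id, tensor_parallel_size))


if __name__ == "__main__":
    # tensor_parallel_size=4 is an arbitrary example value, not a recommendation.
    print(vllm_kwargs("hugging-quants/Meta-Llama-3.1-70B-Instruct-AWQ-INT4", 4))
```

Keeping the argument construction separate from the actual `LLM(...)` call means the mapping can be checked on a machine without GPUs before any weights are downloaded.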
Llama.cpp compatible quants for Llama 3.2 3B and 1B Instruct models.
- hugging-quants/Llama-3.2-3B-Instruct-Q8_0-GGUF
  Text Generation • 3B • Updated • 5.54k • 52
- hugging-quants/Llama-3.2-3B-Instruct-Q4_K_M-GGUF
  Text Generation • 3B • Updated • 8.93k • 20
- hugging-quants/Llama-3.2-1B-Instruct-Q8_0-GGUF
  Text Generation • 1B • Updated • 21.4k • 34
- hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF
  Text Generation • 1B • Updated • 45.8k • 17
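For the GGUF repositories listed above, one possible way to load a model from Python is sketched below. It uses the `llama-cpp-python` bindings rather than the llama.cpp command-line tools, which is an assumption on our part, and it looks up the `.gguf` file name at run time instead of guessing it. The helper names `pick_gguf` and `load_gguf` are hypothetical.

```python
# Sketch only: helper names are hypothetical; repo ids come from the listing.


def pick_gguf(filenames):
    """Return the .gguf files from a repository file listing, sorted by name."""
    return sorted(f for f in filenames if f.lower().endswith(".gguf"))


def load_gguf(repo_id: str):
    """Download the first .gguf file in the repo and open it with llama-cpp-python."""
    # Imported lazily so pick_gguf can be used without these packages installed.
    from huggingface_hub import hf_hub_download, list_repo_files
    from llama_cpp import Llama

    candidates = pick_gguf(list_repo_files(repo_id))
    if not candidates:
        raise FileNotFoundError(f"no .gguf file found in {repo_id}")
    path = hf_hub_download(repo_id=repo_id, filename=candidates[0])
    return Llama(model_path=path)


if __name__ == "__main__":
    # The file names below are made up purely to exercise the filter.
    print(pick_gguf(["README.md", "example-model.gguf", ".gitattributes"]))
```

Calling `load_gguf("hugging-quants/Llama-3.2-1B-Instruct-Q4_K_M-GGUF")` would download the weights, so it is left out of the demo section.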