prithivMLmods/Llama-3.2-3B-Instruct-f32-GGUF

@@ -14,4 +14,31 @@ tags:
 > Llama 3.2 3B Instruct by Meta is a lightweight, instruction-tuned large language model with 3.21 billion parameters, designed for efficient text-only, multilingual dialogue applications such as agentic retrieval, summarization, and instruction following. It employs an optimized transformer architecture with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human-like helpfulness and safety. Boasting an impressive context length capacity of up to 128,000 tokens, it supports multiple languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.
-> The model is tailored for on-device and edge deployment, enabling fast, private, and low-latency inference while maintaining high-quality performance on benchmarks, making it suitable for mobile and resource-constrained environments. Llama 3.2 3B Instruct excels in text generation, rewriting, and dialogue tasks, and represents a robust solution for building advanced AI assistants with strong multilingual and instruction-following capabilities.
+> The model is tailored for on-device and edge deployment, enabling fast, private, and low-latency inference while maintaining high-quality performance on benchmarks, making it suitable for mobile and resource-constrained environments. Llama 3.2 3B Instruct excels in text generation, rewriting, and dialogue tasks, and represents a robust solution for building advanced AI assistants with strong multilingual and instruction-following capabilities.
+## Model Files
+| Model File name | Size | QuantType |
+|---|---|---|
+| Llama-3.2-3B-Instruct.BF16.gguf | 6.43 GB | BF16 |
+| Llama-3.2-3B-Instruct.F16.gguf | 6.43 GB | F16 |
+| Llama-3.2-3B-Instruct.F32.gguf | 12.9 GB | F32 |
+| Llama-3.2-3B-Instruct.Q2_K.gguf | 1.36 GB | Q2_K |
+| Llama-3.2-3B-Instruct.Q3_K_L.gguf | 1.82 GB | Q3_K_L |
+| Llama-3.2-3B-Instruct.Q3_K_M.gguf | 1.69 GB | Q3_K_M |
+| Llama-3.2-3B-Instruct.Q3_K_S.gguf | 1.54 GB | Q3_K_S |
+| Llama-3.2-3B-Instruct.Q4_K_M.gguf | 2.02 GB | Q4_K_M |
+| Llama-3.2-3B-Instruct.Q4_K_S.gguf | 1.93 GB | Q4_K_S |
+| Llama-3.2-3B-Instruct.Q5_K_M.gguf | 2.32 GB | Q5_K_M |
+| Llama-3.2-3B-Instruct.Q5_K_S.gguf | 2.27 GB | Q5_K_S |
+| Llama-3.2-3B-Instruct.Q6_K.gguf | 2.64 GB | Q6_K |
+| Llama-3.2-3B-Instruct.Q8_0.gguf | 3.42 GB | Q8_0 |
+## Quants Usage
+(The files above are listed by name, not by size or quality. IQ-quants are often preferable to similarly sized non-IQ quants.)
+Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):
+![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
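
The sizes in the Model Files table can be turned into a rough bits-per-weight figure and used to shortlist a file for a given storage budget. The following is a minimal sketch, not part of the repository: it assumes the listed sizes are decimal gigabytes (1 GB = 1e9 bytes), uses the 3.21 billion parameter count quoted in the description, and ignores GGUF metadata overhead, so the results are approximate. The helper names `bits_per_weight` and `largest_file_within` are hypothetical.

```python
# Sizes copied from the Model Files table.
# Assumption: the table uses decimal gigabytes (1 GB = 1e9 bytes).
MODEL_FILES_GB = {
    "Llama-3.2-3B-Instruct.BF16.gguf": 6.43,
    "Llama-3.2-3B-Instruct.F16.gguf": 6.43,
    "Llama-3.2-3B-Instruct.F32.gguf": 12.9,
    "Llama-3.2-3B-Instruct.Q2_K.gguf": 1.36,
    "Llama-3.2-3B-Instruct.Q3_K_L.gguf": 1.82,
    "Llama-3.2-3B-Instruct.Q3_K_M.gguf": 1.69,
    "Llama-3.2-3B-Instruct.Q3_K_S.gguf": 1.54,
    "Llama-3.2-3B-Instruct.Q4_K_M.gguf": 2.02,
    "Llama-3.2-3B-Instruct.Q4_K_S.gguf": 1.93,
    "Llama-3.2-3B-Instruct.Q5_K_M.gguf": 2.32,
    "Llama-3.2-3B-Instruct.Q5_K_S.gguf": 2.27,
    "Llama-3.2-3B-Instruct.Q6_K.gguf": 2.64,
    "Llama-3.2-3B-Instruct.Q8_0.gguf": 3.42,
}

# Parameter count quoted in the model description (3.21 billion).
PARAM_COUNT = 3.21e9


def bits_per_weight(size_gb, params=PARAM_COUNT):
    """Rough average number of bits stored per parameter.

    Ignores file metadata, so this slightly overestimates the true value.
    """
    return size_gb * 1e9 * 8 / params


def largest_file_within(budget_gb, files=MODEL_FILES_GB):
    """Return the name of the largest file that fits the budget, or None."""
    fitting = {name: size for name, size in files.items() if size <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


if __name__ == "__main__":
    # Print the files from smallest to largest with their estimated bits/weight.
    for name, size in sorted(MODEL_FILES_GB.items(), key=lambda item: item[1]):
        print(f"{name}: {size} GB, ~{bits_per_weight(size):.2f} bits/weight")
    print(largest_file_within(2.5))
```

File size is only a lower bound on memory use at inference time; the context cache and runtime buffers add to it, so a budget passed to `largest_file_within` should leave headroom.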