Update README.md
README.md
CHANGED
@@ -14,4 +14,31 @@ tags:
> Llama 3.2 3B Instruct by Meta is a lightweight, instruction-tuned large language model with 3.21 billion parameters, designed for efficient text-only, multilingual dialogue applications such as agentic retrieval, summarization, and instruction following. It employs an optimized transformer architecture with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) to align with human-like helpfulness and safety. Boasting an impressive context length capacity of up to 128,000 tokens, it supports multiple languages including English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

-> The model is tailored for on-device and edge deployment, enabling fast, private, and low-latency inference while maintaining high-quality performance on benchmarks, making it suitable for mobile and resource-constrained environments. Llama 3.2 3B Instruct excels in text generation, rewriting, and dialogue tasks, and represents a robust solution for building advanced AI assistants with strong multilingual and instruction-following capabilities.
+> The model is tailored for on-device and edge deployment, enabling fast, private, and low-latency inference while maintaining high-quality performance on benchmarks, making it suitable for mobile and resource-constrained environments. Llama 3.2 3B Instruct excels in text generation, rewriting, and dialogue tasks, and represents a robust solution for building advanced AI assistants with strong multilingual and instruction-following capabilities.
+
+## Model Files
+
+| Model File name | Size | QuantType |
+|---|---|---|
+| Llama-3.2-3B-Instruct.BF16.gguf | 6.43 GB | BF16 |
+| Llama-3.2-3B-Instruct.F16.gguf | 6.43 GB | F16 |
+| Llama-3.2-3B-Instruct.F32.gguf | 12.9 GB | F32 |
+| Llama-3.2-3B-Instruct.Q2_K.gguf | 1.36 GB | Q2_K |
+| Llama-3.2-3B-Instruct.Q3_K_L.gguf | 1.82 GB | Q3_K_L |
+| Llama-3.2-3B-Instruct.Q3_K_M.gguf | 1.69 GB | Q3_K_M |
+| Llama-3.2-3B-Instruct.Q3_K_S.gguf | 1.54 GB | Q3_K_S |
+| Llama-3.2-3B-Instruct.Q4_K_M.gguf | 2.02 GB | Q4_K_M |
+| Llama-3.2-3B-Instruct.Q4_K_S.gguf | 1.93 GB | Q4_K_S |
+| Llama-3.2-3B-Instruct.Q5_K_M.gguf | 2.32 GB | Q5_K_M |
+| Llama-3.2-3B-Instruct.Q5_K_S.gguf | 2.27 GB | Q5_K_S |
+| Llama-3.2-3B-Instruct.Q6_K.gguf | 2.64 GB | Q6_K |
+| Llama-3.2-3B-Instruct.Q8_0.gguf | 3.42 GB | Q8_0 |
+
+## Quants Usage
+
+(sorted by size, not necessarily quality. IQ-quants are often preferable over similar sized non-IQ quants)
+
+Here is a handy graph by ikawrakow comparing some lower-quality quant
+types (lower is better):
+
+
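
As a rough illustration of how the Model Files table added in this diff might be used, the sketch below picks the largest listed file that fits within a disk or memory budget. The sizes are copied from the table; the helper names `largest_quant_within` and `gguf_file_name` are hypothetical and not part of the README. "Largest file" is only a crude proxy, since the README itself notes that size does not necessarily track quality, and actual memory use at inference time will exceed the file size (context cache, runtime overhead).

```python
# Sizes in GB, copied from the Model Files table in the diff above.
QUANT_SIZES_GB = {
    "BF16": 6.43,
    "F16": 6.43,
    "F32": 12.9,
    "Q2_K": 1.36,
    "Q3_K_L": 1.82,
    "Q3_K_M": 1.69,
    "Q3_K_S": 1.54,
    "Q4_K_M": 2.02,
    "Q4_K_S": 1.93,
    "Q5_K_M": 2.32,
    "Q5_K_S": 2.27,
    "Q6_K": 2.64,
    "Q8_0": 3.42,
}


def largest_quant_within(budget_gb, sizes=QUANT_SIZES_GB):
    """Return (quant_type, size_gb) for the largest file <= budget_gb, or None.

    Hypothetical helper: file size is used as a stand-in for fidelity,
    which is an assumption rather than something the README guarantees.
    """
    fitting = [(size, name) for name, size in sizes.items() if size <= budget_gb]
    if not fitting:
        return None
    size, name = max(fitting)
    return name, size


def gguf_file_name(quant_type):
    """Build the file name using the naming pattern shown in the table."""
    return f"Llama-3.2-3B-Instruct.{quant_type}.gguf"


if __name__ == "__main__":
    choice = largest_quant_within(2.5)
    if choice is not None:
        quant, size = choice
        print(gguf_file_name(quant), size)
```

A budget smaller than the smallest listed file (1.36 GB for Q2_K) returns `None`, so callers should handle that case explicitly.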