Llama-3.2-3B-Instruct-f32-GGUF

Llama 3.2 3B Instruct by Meta is a lightweight, instruction-tuned large language model with 3.21 billion parameters, designed for efficient text-only, multilingual dialogue applications such as agentic retrieval, summarization, and instruction following. It uses an optimized transformer architecture and is aligned for helpfulness and safety with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF). It supports a context length of up to 128,000 tokens and officially covers English, German, French, Italian, Portuguese, Hindi, Spanish, and Thai.

The model is tailored for on-device and edge deployment, enabling fast, private, low-latency inference while maintaining strong benchmark performance, which makes it suitable for mobile and resource-constrained environments. Llama 3.2 3B Instruct excels at text generation, rewriting, and dialogue, and is well suited to building AI assistants with strong multilingual and instruction-following capabilities.

Model Files

| Model File | Size | Quant Type |
| --- | --- | --- |
| Llama-3.2-3B-Instruct.BF16.gguf | 6.43 GB | BF16 |
| Llama-3.2-3B-Instruct.F16.gguf | 6.43 GB | F16 |
| Llama-3.2-3B-Instruct.F32.gguf | 12.9 GB | F32 |
| Llama-3.2-3B-Instruct.Q2_K.gguf | 1.36 GB | Q2_K |
| Llama-3.2-3B-Instruct.Q3_K_L.gguf | 1.82 GB | Q3_K_L |
| Llama-3.2-3B-Instruct.Q3_K_M.gguf | 1.69 GB | Q3_K_M |
| Llama-3.2-3B-Instruct.Q3_K_S.gguf | 1.54 GB | Q3_K_S |
| Llama-3.2-3B-Instruct.Q4_K_M.gguf | 2.02 GB | Q4_K_M |
| Llama-3.2-3B-Instruct.Q4_K_S.gguf | 1.93 GB | Q4_K_S |
| Llama-3.2-3B-Instruct.Q5_K_M.gguf | 2.32 GB | Q5_K_M |
| Llama-3.2-3B-Instruct.Q5_K_S.gguf | 2.27 GB | Q5_K_S |
| Llama-3.2-3B-Instruct.Q6_K.gguf | 2.64 GB | Q6_K |
| Llama-3.2-3B-Instruct.Q8_0.gguf | 3.42 GB | Q8_0 |
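The sizes above translate directly into an approximate storage cost per parameter. A small sketch of that arithmetic (the GB figures in the table are rounded, and GGUF files keep some tensors at higher precision, so the results are ballpark estimates, not exact quant bit-widths):

```python
# Rough bits-per-weight estimate for each quant, derived from the file
# sizes in the table above. Sizes are approximate, so treat results as
# ballpark figures rather than the nominal quant bit-width.

PARAMS_B = 3.21  # model parameters, in billions

def bits_per_weight(size_gb: float, params_b: float = PARAMS_B) -> float:
    """Approximate storage cost per parameter: file bits / parameter count."""
    return size_gb * 8 / params_b  # GB -> gigabits, divided by gigaparams

# A few entries from the table:
for name, size_gb in [("F16", 6.43), ("Q8_0", 3.42), ("Q4_K_M", 2.02), ("Q2_K", 1.36)]:
    print(f"{name}: ~{bits_per_weight(size_gb):.1f} bits/weight")
```

The estimates come out slightly above each quant's nominal bit-width (e.g. Q4_K_M lands near 5 bits/weight) because embedding and output tensors are typically stored at higher precision than the bulk of the weights.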

Quants Usage

(sorted by size, not necessarily by quality; IQ quants are often preferable to similarly sized non-IQ quants)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Graph: quality comparison of lower-bit quant types by ikawrakow; lower is better]
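To try one of these quants locally, the file can be fetched with `huggingface-cli` and run with llama.cpp. A minimal sketch, assuming llama.cpp is built and `llama-cli` is on your PATH; Q4_K_M is used here as a common size/quality middle ground:

```shell
# Download a single quant from this repo (huggingface-cli ships with the
# huggingface_hub Python package).
huggingface-cli download prithivMLmods/Llama-3.2-3B-Instruct-f32-GGUF \
  Llama-3.2-3B-Instruct.Q4_K_M.gguf --local-dir .

# Interactive chat with llama.cpp: -cnv enables conversation mode (the
# chat template is read from the GGUF metadata), -c sets the context
# window; the model supports up to 128k tokens if memory allows.
llama-cli -m Llama-3.2-3B-Instruct.Q4_K_M.gguf -cnv -c 8192
```

Larger context windows and higher-bit quants trade memory for quality; on constrained devices, Q4_K_M or Q5_K_M with a reduced `-c` is a common starting point.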

Format: GGUF
Model size: 3.21B params
Architecture: llama

