Smoothie-Qwen3-4B-F32-GGUF

Smoothie Qwen is a lightweight adjustment tool that smooths token probabilities in Qwen and similar models, enhancing balanced multilingual generation capabilities. For more details, please refer to https://github.com/dnotitia/smoothie-qwen.

Model Files

| Filename | Size | Format | Description |
|---|---|---|---|
| Smoothie-Qwen3-4B.BF16.gguf | 8.05 GB | BF16 | bfloat16 (Brain Floating Point), 16-bit |
| Smoothie-Qwen3-4B.F16.gguf | 8.05 GB | F16 | Half precision (16-bit) floating point |
| Smoothie-Qwen3-4B.F32.gguf | 16.1 GB | F32 | Full precision (32-bit) floating point |
| Smoothie-Qwen3-4B.Q2_K.gguf | 1.67 GB | Q2_K | 2-bit K-quant |
| Smoothie-Qwen3-4B.Q3_K_L.gguf | 2.24 GB | Q3_K_L | 3-bit K-quant (Large) |
| Smoothie-Qwen3-4B.Q3_K_M.gguf | 2.08 GB | Q3_K_M | 3-bit K-quant (Medium) |
| Smoothie-Qwen3-4B.Q3_K_S.gguf | 1.89 GB | Q3_K_S | 3-bit K-quant (Small) |
| Smoothie-Qwen3-4B.Q4_K_M.gguf | 2.5 GB | Q4_K_M | 4-bit K-quant (Medium) |
| Smoothie-Qwen3-4B.Q4_K_S.gguf | 2.38 GB | Q4_K_S | 4-bit K-quant (Small) |
| Smoothie-Qwen3-4B.Q5_K_M.gguf | 2.89 GB | Q5_K_M | 5-bit K-quant (Medium) |
| Smoothie-Qwen3-4B.Q5_K_S.gguf | 2.82 GB | Q5_K_S | 5-bit K-quant (Small) |
| Smoothie-Qwen3-4B.Q6_K.gguf | 3.31 GB | Q6_K | 6-bit K-quant |
| Smoothie-Qwen3-4B.Q8_0.gguf | 4.28 GB | Q8_0 | 8-bit quantization |
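
As a rough sanity check on the sizes above, the effective bits per weight of a file can be estimated from its size and the 4.02B parameter count reported for this model. The sketch below assumes 1 GB = 10^9 bytes and ignores file metadata, so the figures are approximate; `bits_per_weight` is a hypothetical helper written for this illustration.

```python
# Approximate bits per weight for a few of the files listed above.
# Assumptions: 1 GB = 1e9 bytes, 4.02e9 parameters, file metadata ignored.
PARAMS = 4.02e9

# Sizes in GB, copied from the Model Files table.
SIZES_GB = {
    "Q2_K": 1.67,
    "Q4_K_M": 2.5,
    "Q8_0": 4.28,
    "F16": 8.05,
    "F32": 16.1,
}


def bits_per_weight(size_gb: float, params: float = PARAMS) -> float:
    """Convert a file size in GB to an average number of bits per parameter."""
    return size_gb * 1e9 * 8 / params


for name, size_gb in SIZES_GB.items():
    print(f"{name:7s} {bits_per_weight(size_gb):5.2f} bits/weight")
```

The quantized files are expected to come out somewhat above their nominal bit width, since block scales and any tensors stored at higher precision add overhead.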

Recommended Usage

  • Q4_K_M or Q5_K_M: Best balance of quality and performance for most users
  • Q6_K or Q8_0: Higher quality, larger file sizes
  • Q2_K or Q3_K_S: Fastest inference, lower quality
  • F16 or BF16: High quality, requires more VRAM
  • F32: Highest quality, requires significant VRAM
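
One way to act on these recommendations is to pick the largest file that fits a memory budget and then fetch it. The sketch below is illustrative only: `pick_quant` and `download` are hypothetical helpers, the sizes are copied from the Model Files table, and the repository ID is the one shown on this page. File size is used as a crude lower bound on memory, since the KV cache and runtime buffers need additional room. `hf_hub_download` comes from the `huggingface_hub` package, which must be installed separately.

```python
from typing import Optional

# Sizes in GB, copied from the Model Files table (BF16 is the same size as F16).
FILES = {
    "Q2_K": 1.67, "Q3_K_S": 1.89, "Q3_K_M": 2.08, "Q3_K_L": 2.24,
    "Q4_K_S": 2.38, "Q4_K_M": 2.5, "Q5_K_S": 2.82, "Q5_K_M": 2.89,
    "Q6_K": 3.31, "Q8_0": 4.28, "F16": 8.05, "F32": 16.1,
}

REPO_ID = "prithivMLmods/Smoothie-Qwen3-4B-F32-GGUF"


def pick_quant(budget_gb: float) -> Optional[str]:
    """Return the largest format whose file size fits the budget, or None."""
    fitting = [(size, name) for name, size in FILES.items() if size <= budget_gb]
    return max(fitting)[1] if fitting else None


def download(fmt: str) -> str:
    """Fetch one GGUF file from the Hub and return its local path."""
    # Imported lazily so pick_quant works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(repo_id=REPO_ID, filename=f"Smoothie-Qwen3-4B.{fmt}.gguf")


choice = pick_quant(3.0)  # hypothetical budget: 3 GB available for weights
print(choice)
# path = download(choice)  # uncomment to fetch the file (needs network access)
```

The returned path can then be passed to any GGUF-compatible runtime, such as llama.cpp.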

Quants Usage

(Sorted by size, not necessarily quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

[Figure: graph by ikawrakow comparing lower-quality quant types]

Format: GGUF
Model size: 4.02B params
Architecture: qwen3

Model tree for prithivMLmods/Smoothie-Qwen3-4B-F32-GGUF

Base model: Qwen/Qwen3-4B-Base
Finetuned: Qwen/Qwen3-4B
Quantized: this model