Qwen3-4B-Esper3-F32-GGUF

Esper 3 is a specialist model built on Qwen 3, designed for coding, architecture, and DevOps reasoning. It has been fine-tuned on our proprietary DevOps, architecture, and code reasoning dataset, generated with DeepSeek R1. This tuning also enhances its general and creative reasoning abilities, making it effective in general conversation as well as in problem-solving. Its small model size keeps inference fast, making it suitable for deployment on local desktops, mobile devices, and high-speed server environments.

Model Files

Filename	Size	Format	Description
Qwen3-4B-Esper3.BF16.gguf	8.05 GB	BF16	bfloat16 (16-bit brain floating point), unquantized
Qwen3-4B-Esper3.F16.gguf	8.05 GB	F16	Half precision (16-bit) floating point
Qwen3-4B-Esper3.F32.gguf	16.1 GB	F32	Full precision (32-bit) floating point
Qwen3-4B-Esper3.Q2_K.gguf	1.67 GB	Q2_K	2-bit K-quant
Qwen3-4B-Esper3.Q3_K_L.gguf	2.24 GB	Q3_K_L	3-bit K-quant (large variant)
Qwen3-4B-Esper3.Q3_K_M.gguf	2.08 GB	Q3_K_M	3-bit K-quant (medium variant)
Qwen3-4B-Esper3.Q3_K_S.gguf	1.89 GB	Q3_K_S	3-bit K-quant (small variant)
Qwen3-4B-Esper3.Q4_K_M.gguf	2.5 GB	Q4_K_M	4-bit K-quant (medium variant)
Qwen3-4B-Esper3.Q4_K_S.gguf	2.38 GB	Q4_K_S	4-bit K-quant (small variant)
Qwen3-4B-Esper3.Q5_K_M.gguf	2.89 GB	Q5_K_M	5-bit K-quant (medium variant)
Qwen3-4B-Esper3.Q5_K_S.gguf	2.82 GB	Q5_K_S	5-bit K-quant (small variant)
Qwen3-4B-Esper3.Q6_K.gguf	3.31 GB	Q6_K	6-bit K-quant
Qwen3-4B-Esper3.Q8_0.gguf	4.28 GB	Q8_0	8-bit quantization
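
To make the sizes above easier to compare, the following minimal sketch converts each file size into an approximate number of bits stored per weight. It rests on two assumptions that are not stated in this card: that "GB" means 10^9 bytes, and that the F32 file holds essentially nothing but 4-byte weights, so its size gives the parameter count. The names `SIZES_GB`, `estimated_params`, and `bits_per_weight` are illustrative only.

```python
# File sizes in GB, copied from the Model Files table.
SIZES_GB = {
    "F32": 16.1, "BF16": 8.05, "F16": 8.05, "Q8_0": 4.28, "Q6_K": 3.31,
    "Q5_K_M": 2.89, "Q5_K_S": 2.82, "Q4_K_M": 2.5, "Q4_K_S": 2.38,
    "Q3_K_L": 2.24, "Q3_K_M": 2.08, "Q3_K_S": 1.89, "Q2_K": 1.67,
}


def estimated_params(sizes=SIZES_GB):
    """Estimate the parameter count from the F32 file.

    Assumption: 4 bytes per weight, metadata negligible, 1 GB = 1e9 bytes.
    """
    return sizes["F32"] * 1e9 / 4


def bits_per_weight(fmt, sizes=SIZES_GB):
    """Approximate average bits stored per weight for a given format."""
    return sizes[fmt] * 1e9 * 8 / estimated_params(sizes)


if __name__ == "__main__":
    # Print formats from smallest to largest file.
    for fmt in sorted(SIZES_GB, key=SIZES_GB.get):
        print(f"{fmt:8s} {SIZES_GB[fmt]:6.2f} GB  ~{bits_per_weight(fmt):5.2f} bits/weight")
```

Under these assumptions the quantized files come out somewhat above their nominal bit width (for example, Q4_K_M lands near 5 bits per weight), most likely because some tensors and per-block scales are stored at higher precision.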

Recommended Usage

Q4_K_M or Q5_K_M: Best balance of quality and speed for most users
Q6_K or Q8_0: Higher quality, larger file sizes (3.31 GB and 4.28 GB)
Q2_K or Q3_K_S: Smallest files and fastest inference, lower quality
F16 or BF16: High quality, requires about 8 GB of memory for the weights alone
F32: Highest quality, requires about 16 GB of memory for the weights alone
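
The recommendations above can be turned into a simple selection rule: take the largest file that fits in the available memory with some room to spare. The sketch below is one possible way to do that, not an official tool. The `headroom_gb` default of 1.0 is a hypothetical placeholder for context cache and runtime overhead, and `pick_file` and `download` are illustrative names. The `download` helper assumes the third-party `huggingface_hub` package is installed and uses its `hf_hub_download` function with this repository's id.

```python
# Quantized files and their sizes in GB, copied from the Model Files table.
FILES_GB = {
    "Qwen3-4B-Esper3.Q2_K.gguf": 1.67,
    "Qwen3-4B-Esper3.Q3_K_S.gguf": 1.89,
    "Qwen3-4B-Esper3.Q3_K_M.gguf": 2.08,
    "Qwen3-4B-Esper3.Q3_K_L.gguf": 2.24,
    "Qwen3-4B-Esper3.Q4_K_S.gguf": 2.38,
    "Qwen3-4B-Esper3.Q4_K_M.gguf": 2.5,
    "Qwen3-4B-Esper3.Q5_K_S.gguf": 2.82,
    "Qwen3-4B-Esper3.Q5_K_M.gguf": 2.89,
    "Qwen3-4B-Esper3.Q6_K.gguf": 3.31,
    "Qwen3-4B-Esper3.Q8_0.gguf": 4.28,
}


def pick_file(budget_gb, headroom_gb=1.0, files=FILES_GB):
    """Return the largest file that fits in budget_gb with headroom, or None.

    headroom_gb is a hypothetical allowance for context cache and runtime
    overhead; tune it for your own setup.
    """
    fitting = {name: size for name, size in files.items()
               if size + headroom_gb <= budget_gb}
    if not fitting:
        return None
    return max(fitting, key=fitting.get)


def download(filename, repo_id="prithivMLmods/Qwen3-4B-Esper3-F32-GGUF"):
    """Download one GGUF file and return its local path.

    Requires the third-party package: pip install huggingface_hub
    """
    from huggingface_hub import hf_hub_download  # imported lazily
    return hf_hub_download(repo_id=repo_id, filename=filename)


if __name__ == "__main__":
    choice = pick_file(budget_gb=4.0)
    print(choice)  # Qwen3-4B-Esper3.Q5_K_M.gguf
```

The path returned by `download` can then be passed to any GGUF-compatible runtime, such as llama.cpp. Larger file size is used here only as a rough proxy for quality.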

Quants Usage

(Sorted by size, not necessarily quality. IQ-quants are often preferable to similarly sized non-IQ quants.)

Here is a handy graph by ikawrakow comparing some lower-quality quant types (lower is better):

Model tree for prithivMLmods/Qwen3-4B-Esper3-F32-GGUF