# Jan-v1-AIO-GGUF

**Jan-v1-4B** is a 4-billion-parameter language model built on the Qwen3-4B-thinking architecture and fine-tuned for agentic reasoning, problem-solving, and tool use, with support for web-search tasks and context lengths of up to 256,000 tokens. Scoring 91.1% on the SimpleQA benchmark, it excels at factual question answering and conversation while running efficiently on local hardware for privacy and offline use, making it a strong choice for advanced Q&A, reasoning, and integration with the Jan desktop application or compatible inference engines.

**Jan-v1-edge** is a lightweight agentic model built for fast, reliable on-device execution. The second release in the Jan Family, it is distilled from the larger Jan-v1 model and preserves strong reasoning and problem-solving ability in a smaller footprint suited to resource-constrained environments. Jan-v1-edge was developed through a two-phase post-training process: Supervised Fine-Tuning (SFT) first transferred core capabilities from the Jan-v1 teacher model to the smaller student, and Reinforcement Learning with Verifiable Rewards (RLVR), the same method used for Jan-v1 and Lucy, then further optimized reasoning efficiency, tool use, and correctness. This staged approach delivers reliable results on complex, interactive workloads.
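
Both models ship as GGUF files, so any llama.cpp-based runtime can load them. Below is a minimal sketch using the `llama-cpp-python` bindings (an assumption; any compatible engine works), with a quant file name taken from the tables further down and already downloaded locally:

```python
from llama_cpp import Llama

# Load a local GGUF quant. n_ctx is kept small here for modest RAM;
# the model itself supports contexts up to 256,000 tokens.
llm = Llama(
    model_path="Jan-v1-4B.Q4_K_M.gguf",
    n_ctx=8192,
    n_gpu_layers=-1,  # offload all layers to the GPU when one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```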

## Jan-v1 GGUF Models

| Model Name | Hugging Face Link |
|------------|-------------------|
| Jan-v1-edge-GGUF | 🔗 Link |
| Jan-v1-4B-GGUF | 🔗 Link |

## Model Files

### Jan-v1-edge

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| Jan-v1-edge.BF16.gguf | BF16 | 3.45 GB |
| Jan-v1-edge.F16.gguf | F16 | 3.45 GB |
| Jan-v1-edge.F32.gguf | F32 | 6.89 GB |
| Jan-v1-edge.Q2_K.gguf | Q2_K | 778 MB |
| Jan-v1-edge.Q3_K_L.gguf | Q3_K_L | 1 GB |
| Jan-v1-edge.Q3_K_M.gguf | Q3_K_M | 940 MB |
| Jan-v1-edge.Q3_K_S.gguf | Q3_K_S | 867 MB |
| Jan-v1-edge.Q4_0.gguf | Q4_0 | 1.05 GB |
| Jan-v1-edge.Q4_1.gguf | Q4_1 | 1.14 GB |
| Jan-v1-edge.Q4_K.gguf | Q4_K | 1.11 GB |
| Jan-v1-edge.Q4_K_M.gguf | Q4_K_M | 1.11 GB |
| Jan-v1-edge.Q4_K_S.gguf | Q4_K_S | 1.06 GB |
| Jan-v1-edge.Q5_0.gguf | Q5_0 | 1.23 GB |
| Jan-v1-edge.Q5_1.gguf | Q5_1 | 1.32 GB |
| Jan-v1-edge.Q5_K.gguf | Q5_K | 1.26 GB |
| Jan-v1-edge.Q5_K_M.gguf | Q5_K_M | 1.26 GB |
| Jan-v1-edge.Q5_K_S.gguf | Q5_K_S | 1.23 GB |
| Jan-v1-edge.Q6_K.gguf | Q6_K | 1.42 GB |
| Jan-v1-edge.Q8_0.gguf | Q8_0 | 1.83 GB |

### Jan-v1-4B

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| Jan-v1-4B.BF16.gguf | BF16 | 8.05 GB |
| Jan-v1-4B.F16.gguf | F16 | 8.05 GB |
| Jan-v1-4B.F32.gguf | F32 | 16.1 GB |
| Jan-v1-4B.Q2_K.gguf | Q2_K | 1.67 GB |
| Jan-v1-4B.Q3_K_L.gguf | Q3_K_L | 2.24 GB |
| Jan-v1-4B.Q3_K_M.gguf | Q3_K_M | 2.08 GB |
| Jan-v1-4B.Q3_K_S.gguf | Q3_K_S | 1.89 GB |
| Jan-v1-4B.Q4_K_M.gguf | Q4_K_M | 2.5 GB |
| Jan-v1-4B.Q4_K_S.gguf | Q4_K_S | 2.38 GB |
| Jan-v1-4B.Q5_K_M.gguf | Q5_K_M | 2.89 GB |
| Jan-v1-4B.Q5_K_S.gguf | Q5_K_S | 2.82 GB |
| Jan-v1-4B.Q6_K.gguf | Q6_K | 3.31 GB |
| Jan-v1-4B.Q8_0.gguf | Q8_0 | 4.28 GB |
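
Individual quants can be fetched without cloning the whole repository. A small sketch using `huggingface_hub` (assuming the files above are hosted in this repo, `prithivMLmods/Jan-v1-AIO-GGUF`):

```python
from huggingface_hub import hf_hub_download

# Download a single quant file from the Hub; returns the local cache path.
model_path = hf_hub_download(
    repo_id="prithivMLmods/Jan-v1-AIO-GGUF",
    filename="Jan-v1-edge.Q4_K_M.gguf",
)
print(model_path)
```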

## Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants.)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).

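
As a rough rule of thumb, a quant's bits-per-weight can be estimated from the file sizes listed above. A quick sketch (assuming the 4.02 B parameter count reported for Jan-v1-4B on the hub and the decimal-GB sizes the Hub displays):

```python
# Estimate bits per weight from file size: size_bytes * 8 / n_params.
N_PARAMS = 4.02e9  # Jan-v1-4B parameter count

for quant, size_gb in [("Q2_K", 1.67), ("Q4_K_M", 2.5), ("Q8_0", 4.28), ("F16", 8.05)]:
    bpw = size_gb * 1e9 * 8 / N_PARAMS
    print(f"{quant}: ~{bpw:.1f} bits/weight")
# Prints roughly: Q2_K ~3.3, Q4_K_M ~5.0, Q8_0 ~8.5, F16 ~16.0
```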
