# Jan-v1-AIO-GGUF

**Jan-v1-4B** is a 4-billion-parameter language model built on the Qwen3-4B-thinking architecture and fine-tuned for agentic reasoning, problem-solving, and tool use, with support for web-search tasks and context lengths of up to 256,000 tokens. Scoring 91.1% on the SimpleQA benchmark, it excels at factual question answering and conversation while running efficiently on local hardware for privacy and offline use, making it a strong choice for advanced Q&A, reasoning, and integration with the Jan desktop application or compatible inference engines.

**Jan-v1-edge** is a lightweight agentic model built for fast, reliable on-device execution. The second release in the Jan Family, it is distilled from the larger Jan-v1 model and preserves strong reasoning and problem-solving ability in a smaller footprint suited to resource-constrained environments. Jan-v1-edge was developed through a two-phase post-training process: Supervised Fine-Tuning (SFT) first transferred core capabilities from the Jan-v1 teacher model to the smaller student, and Reinforcement Learning with Verifiable Rewards (RLVR), the same method used for Jan-v1 and Lucy, then further optimized reasoning efficiency, tool use, and correctness. This staged approach delivers reliable results on complex, interactive workloads.
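
Both models ship as GGUF files, so any llama.cpp-based runtime can load them. Below is a minimal sketch using the `llama-cpp-python` bindings (an assumption; any compatible engine works), with a quant file name taken from the tables further down and already downloaded locally:

```python
from llama_cpp import Llama

# Load a local GGUF quant. n_ctx is kept small here for modest RAM;
# the model itself supports contexts up to 256,000 tokens.
llm = Llama(
    model_path="Jan-v1-4B.Q4_K_M.gguf",
    n_ctx=8192,
    n_gpu_layers=-1,  # offload all layers to the GPU when one is available
)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "What is the capital of France?"}],
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```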

## Jan-v1 GGUF Models

| Model Name | Hugging Face Link |
|------------|-------------------|
| Jan-v1-edge-GGUF | 🔗 Link |
| Jan-v1-4B-GGUF | 🔗 Link |

## Model Files

### Jan-v1-edge

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| Jan-v1-edge.BF16.gguf | BF16 | 3.45 GB |
| Jan-v1-edge.F16.gguf | F16 | 3.45 GB |
| Jan-v1-edge.F32.gguf | F32 | 6.89 GB |
| Jan-v1-edge.Q2_K.gguf | Q2_K | 778 MB |
| Jan-v1-edge.Q3_K_L.gguf | Q3_K_L | 1 GB |
| Jan-v1-edge.Q3_K_M.gguf | Q3_K_M | 940 MB |
| Jan-v1-edge.Q3_K_S.gguf | Q3_K_S | 867 MB |
| Jan-v1-edge.Q4_0.gguf | Q4_0 | 1.05 GB |
| Jan-v1-edge.Q4_1.gguf | Q4_1 | 1.14 GB |
| Jan-v1-edge.Q4_K.gguf | Q4_K | 1.11 GB |
| Jan-v1-edge.Q4_K_M.gguf | Q4_K_M | 1.11 GB |
| Jan-v1-edge.Q4_K_S.gguf | Q4_K_S | 1.06 GB |
| Jan-v1-edge.Q5_0.gguf | Q5_0 | 1.23 GB |
| Jan-v1-edge.Q5_1.gguf | Q5_1 | 1.32 GB |
| Jan-v1-edge.Q5_K.gguf | Q5_K | 1.26 GB |
| Jan-v1-edge.Q5_K_M.gguf | Q5_K_M | 1.26 GB |
| Jan-v1-edge.Q5_K_S.gguf | Q5_K_S | 1.23 GB |
| Jan-v1-edge.Q6_K.gguf | Q6_K | 1.42 GB |
| Jan-v1-edge.Q8_0.gguf | Q8_0 | 1.83 GB |

### Jan-v1-4B

| File Name | Quant Type | File Size |
|-----------|------------|-----------|
| Jan-v1-4B.BF16.gguf | BF16 | 8.05 GB |
| Jan-v1-4B.F16.gguf | F16 | 8.05 GB |
| Jan-v1-4B.F32.gguf | F32 | 16.1 GB |
| Jan-v1-4B.Q2_K.gguf | Q2_K | 1.67 GB |
| Jan-v1-4B.Q3_K_L.gguf | Q3_K_L | 2.24 GB |
| Jan-v1-4B.Q3_K_M.gguf | Q3_K_M | 2.08 GB |
| Jan-v1-4B.Q3_K_S.gguf | Q3_K_S | 1.89 GB |
| Jan-v1-4B.Q4_K_M.gguf | Q4_K_M | 2.5 GB |
| Jan-v1-4B.Q4_K_S.gguf | Q4_K_S | 2.38 GB |
| Jan-v1-4B.Q5_K_M.gguf | Q5_K_M | 2.89 GB |
| Jan-v1-4B.Q5_K_S.gguf | Q5_K_S | 2.82 GB |
| Jan-v1-4B.Q6_K.gguf | Q6_K | 3.31 GB |
| Jan-v1-4B.Q8_0.gguf | Q8_0 | 4.28 GB |
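
Individual quants can be fetched without cloning the whole repository. A small sketch using `huggingface_hub` (assuming the files above are hosted in this repo, `prithivMLmods/Jan-v1-AIO-GGUF`):

```python
from huggingface_hub import hf_hub_download

# Download a single quant file from the Hub; returns the local cache path.
model_path = hf_hub_download(
    repo_id="prithivMLmods/Jan-v1-AIO-GGUF",
    filename="Jan-v1-edge.Q4_K_M.gguf",
)
print(model_path)
```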

## Quants Usage

(Sorted by size, not necessarily by quality. IQ-quants are often preferable to similar-sized non-IQ quants.)

ikawrakow has published a handy graph comparing some lower-quality quant types (lower is better).

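
As a rough rule of thumb, a quant's bits-per-weight can be estimated from the file sizes listed above. A quick sketch (assuming the 4.02 B parameter count reported for Jan-v1-4B on the hub and the decimal-GB sizes the Hub displays):

```python
# Estimate bits per weight from file size: size_bytes * 8 / n_params.
N_PARAMS = 4.02e9  # Jan-v1-4B parameter count

for quant, size_gb in [("Q2_K", 1.67), ("Q4_K_M", 2.5), ("Q8_0", 4.28), ("F16", 8.05)]:
    bpw = size_gb * 1e9 * 8 / N_PARAMS
    print(f"{quant}: ~{bpw:.1f} bits/weight")
# Prints roughly: Q2_K ~3.3, Q4_K_M ~5.0, Q8_0 ~8.5, F16 ~16.0
```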
