ik_llama.cpp quantizations of DeepSeek-TNG-R1T2-Chimera

Quantized using ik_llama.cpp build 3788 (commit 4622fadc)

NOTE: These quants MUST be run with the llama.cpp fork ik_llama.cpp; the quant types used here are not supported by mainline llama.cpp. An example launch command is sketched below the quant table.

Credits to @ubergarm, whose DeepSeek quant recipes these quants are based on.

Please check out his repo for smaller quants with imatrix: https://huggingface.co/ubergarm/DeepSeek-TNG-R1T2-Chimera-GGUF

| Name | File size | Quant type (layer mix) | Bits per weight |
|---|---|---|---|
| DeepSeek-TNG-R1T2-Chimera-IQ4_XS_R8 | 340.764 GiB | IQ4_XS_R8 (97.5%) / Q8_0 (2.5%) | 4.362 |
| DeepSeek-TNG-R1T2-Chimera-D-IQ4_KS_R4 | 366.762 GiB | IQ4_KS_R4 (65%) / IQ5_KS_R4 (32.5%) / Q8_0 (2.5%) | 4.695 |
| DeepSeek-TNG-R1T2-Chimera-D-Q4_K_R4 | 412.131 GiB | Q4_K_R4 (65%) / Q6_K_R4 (32.5%) / Q8_0 (2.5%) | 5.276 |
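
Below is a minimal sketch of downloading one of these quants and serving it with ik_llama.cpp. Only the repository path is taken from this page; the shard file names, hardware-dependent values (threads, GPU layers, context size) and the ik_llama.cpp-specific flags shown (-mla, -fmoe, -amb) are illustrative assumptions, so check them against ik_llama.cpp's --help output and @ubergarm's model cards before use.

```bash
# Download one quant into a local folder.
# The --include pattern is an assumption; match it to the actual file layout in the repo.
huggingface-cli download Kebob/DeepSeek-TNG-R1T2-Chimera-IK_GGUF \
  --include "DeepSeek-TNG-R1T2-Chimera-IQ4_XS_R8/*" \
  --local-dir ./DeepSeek-TNG-R1T2-Chimera-IK_GGUF

# Serve with ik_llama.cpp (not mainline llama.cpp).
# Point --model at the first GGUF shard; replace NN with the actual shard count.
# -mla, -fmoe and -amb are ik_llama.cpp-specific options; the values here are examples
# only and should be tuned to your hardware (VRAM, RAM, core count).
./build/bin/llama-server \
  --model ./DeepSeek-TNG-R1T2-Chimera-IK_GGUF/DeepSeek-TNG-R1T2-Chimera-IQ4_XS_R8/DeepSeek-TNG-R1T2-Chimera-IQ4_XS_R8-00001-of-000NN.gguf \
  --ctx-size 32768 \
  -mla 2 -fa \
  -amb 512 \
  -fmoe \
  --n-gpu-layers 63 \
  --override-tensor exps=CPU \
  --threads 32 \
  --host 127.0.0.1 --port 8080
```

The `--override-tensor exps=CPU` pattern keeps the routed-expert tensors in system RAM while attention and shared layers go to the GPU, a common hybrid CPU+GPU layout for DeepSeek-sized MoE models; drop it (and raise `--n-gpu-layers`) only if you have enough VRAM to hold the full model.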