Compare to new Dynamic v2.0 Unsloth quants?
Thank you so much for your quants and the info you provide on how to use ik_llama.cpp!
It seems Unsloth just released new quants:
https://huggingface.co/unsloth/gemma-3-27b-it-GGUF
Would you mind comparing your quants with the new (Dynamic v2.0) Unsloth quants?
You can compare the model cards in the sidebar to see the exact tensor differences against a similarly sized quant, e.g. their Q4_K_M.
I go over how to look at that in more detail in the thread where you asked the same thing about the V3-0324 comparison.
In general, the quants available on ik_llama.cpp are better quality than comparable quants available in mainline, with only slightly reduced speed. I released these quants specifically so folks can run the best quality quants available for 16GB VRAM cards for this model.
Also, this one is made specifically to be around 4bpw because it is built from the QAT version, whereas, going by its model card, I don't think the exact model you linked is from the QAT version.
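For anyone unsure what "around 4bpw" means in practice, here is a minimal sketch of the arithmetic: bits per weight is just the file size in bits divided by the parameter count. The function name and the numbers below are illustrative assumptions, not measurements of any actual GGUF file from this repo or from Unsloth.

```python
def bits_per_weight(file_size_bytes: int, n_params: int) -> float:
    """Average number of bits stored per model weight in a quantized file."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return file_size_bytes * 8 / n_params


# Hypothetical example: a 13.5 GB file holding a 27B-parameter model.
# These values are made up for illustration only.
example_bpw = bits_per_weight(13_500_000_000, 27_000_000_000)
print(f"{example_bpw:.2f} bpw")
```

Running the same calculation on two downloaded files is a quick way to check whether two quants are actually comparable in size before comparing quality.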
Though again, I haven't downloaded the Unsloth quants, checked their perplexity, or run any lm-evaluation-harness type benchmarks.
Feel free to post if you do! Cheers!
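If you do run a comparison, it may help to remember what the perplexity number is: the exponential of the average negative log-likelihood per token, so lower is better. A minimal sketch, assuming you already have per-token natural-log probabilities from whatever tool you use (the function name and the input values are hypothetical):

```python
import math


def perplexity(token_logprobs: list[float]) -> float:
    """Perplexity from per-token natural-log probabilities: exp(-mean(log p))."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


# Hypothetical input: every token predicted with probability 0.25.
# Not real model output, just a sanity check of the formula.
uniform_logprobs = [math.log(0.25)] * 4
print(perplexity(uniform_logprobs))
```

Perplexity values are only comparable between quants when they are computed on the same text with the same context length.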