IQ4_NL is generating gibberish on llama.cpp

#2 · opened by netroy

With IQ4_NL, the latest llama.cpp generates complete gibberish on CUDA, on Vulkan, and on CPU-only as well.

[screenshot: gibberish output from llama-cli]
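
A minimal sketch of how the GPU-offloaded and CPU-only cases can be compared with the same quant (the prompt and token count are placeholders, assuming a standard llama.cpp build):

```sh
# Hypothetical reproduction sketch: compare GPU-offloaded vs CPU-only output.
# Prompt and token count are illustrative placeholders, not from the original report.

# Full GPU offload (CUDA or Vulkan, depending on how llama.cpp was built)
./llama.cpp/llama-cli -hf unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:IQ4_NL \
  -ngl 99 --jinja -p "Explain what IQ4_NL quantization is." -n 128

# CPU-only: -ngl 0 keeps all layers on the CPU, ruling out GPU backend bugs
./llama.cpp/llama-cli -hf unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:IQ4_NL \
  -ngl 0 --jinja -p "Explain what IQ4_NL quantization is." -n 128
```

If only the GPU runs produce gibberish, that points at a backend kernel issue; if the CPU-only run is also broken, corrupted weights are more likely.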

Unsloth AI org

Does this happen with Q8 as well for you? I tried IQ4_NL and it works fine for me.

Unsloth AI org

@netroy I tried on CUDA again via `./llama.cpp/llama-cli -hf unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF:IQ4_NL -ngl 99 --jinja` and it works fine - see the screenshot below:

[screenshot: coherent output from the IQ4_NL quant]

Please try redownloading the model weights as well.
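
A minimal sketch of one way to re-fetch just the IQ4_NL files with `huggingface-cli` (the `--include` pattern and local directory are assumptions; adjust them to the actual file names in the repo):

```sh
# Hypothetical redownload sketch: fetch only the IQ4_NL shards again.
# The filename pattern and target directory are assumptions.
huggingface-cli download unsloth/Qwen3-30B-A3B-Thinking-2507-GGUF \
  --include "*IQ4_NL*" \
  --local-dir ./Qwen3-30B-A3B-Thinking-2507-GGUF
```

Re-downloading rules out a truncated or corrupted GGUF file as the cause of the gibberish.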
