bartowski/kalomaze_Qwen3-16B-A3B-GGUF

I just downloaded the Q8 quant in LM Studio, and I get a Jinja template error. Copying the template from another Qwen 3 model fixes it. I am on the latest version of LM Studio (build 0.3.15) with the beta CUDA runtime (v1.29, llama.cpp b5219). Is the error specific to LM Studio, or is it in the quant itself?

As a side note, could you quantize SmolVLM2 and Qwen 2.5 VL? SmolVLM2 should already be working. As for Qwen 2.5 VL, from what I can see, ggml-org already has a few basic quants, but ngxson has some bugs with the 32B version. However, when I download the mmproj file from ggml-org/Qwen2.5-VL-32B-Instruct-GGUF and combine it with the IQ4_XS text-only quant from mradermacher/Qwen2.5-VL-32B-Instruct-i1-GGUF, it seems to work. I'm not sure how mradermacher managed to produce the text-only quant about a month ago, though.
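
For anyone who wants to try the same pairing outside LM Studio, here is a minimal sketch of how the two files could be passed to llama.cpp together. It only builds the command line; it does not download or run anything. The file names below are placeholders, not the real names in either repo, and I am assuming a llama.cpp build whose `llama-mtmd-cli` tool supports this model. The helper `build_mtmd_command` is my own illustration, not part of any library.

```python
import shlex


def build_mtmd_command(model_path, mmproj_path, image_path, prompt,
                       binary="llama-mtmd-cli"):
    """Build the argument list that pairs a text-only GGUF with an mmproj GGUF.

    Assumption: `binary` is llama.cpp's multimodal CLI, and the build in use
    supports the model. All paths are supplied by the caller.
    """
    return [
        binary,
        "-m", model_path,         # text-only quant (language model weights)
        "--mmproj", mmproj_path,  # vision projector, downloaded separately
        "--image", image_path,    # image to describe
        "-p", prompt,             # text prompt
    ]


# Placeholder file names: substitute the files you actually downloaded.
cmd = build_mtmd_command(
    model_path="text-only-IQ4_XS.gguf",
    mmproj_path="mmproj.gguf",
    image_path="test.jpg",
    prompt="Describe this image.",
)

# Print a shell-safe version of the command for copy and paste.
print(shlex.join(cmd))
```

The point is just that the text weights and the projector are independent files, so a text-only quant from one repo can be combined with an mmproj from another as long as both come from the same base model.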