4-bit
Hi @btbtyler09, thanks for uploading this 8-bit version. I've been looking for a GPTQ/AWQ version of this model to run with SGLang. Any chance you can upload a 4-bit version (either GPTQ or AWQ)?
Thank you!
I will try that tonight. I'm currently trying to make sure this one works ok with vLLM on my machine. Check back sometime tomorrow. Thanks!
I actually can't seem to get this model to run on my machine, so I probably won't make the 4-bit version yet. It may be a problem with my own setup and not the model, but I'd rather not upload another one until I can verify it's working correctly.
A couple of days back I tried https://www.modelscope.cn/models/swift/Qwen3-30B-A3B-AWQ, but the model produced gibberish (a sequence of punctuation marks). Perhaps something about this model breaks when it's quantized.
Someone just copied that same model over to Hugging Face: https://huggingface.co/cognitivecomputations/Qwen3-30B-A3B-AWQ
Regarding GPTQ quantization for the MoE model, it seems the model is missing a packed_modules_mapping definition. I think this is on the vLLM side, which hasn't defined it for this architecture yet. Related: https://github.com/vllm-project/vllm/issues/17337
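For context, here is a minimal sketch of what that mapping looks like. In vLLM, model classes declare a packed_modules_mapping that tells the quantized-weight loader which separate checkpoint tensors get fused into one packed module; if it's missing for an architecture, GPTQ/AWQ loading can fail. The names below follow the common Qwen/Llama convention and are assumptions on my part, not the exact Qwen3-MoE definition:

```python
# Hypothetical sketch of vLLM's packed_modules_mapping convention.
# Keys are the fused ("packed") module names in the vLLM model;
# values are the separate per-shard names found in the HF checkpoint.
packed_modules_mapping = {
    # fused attention projection built from separate q/k/v checkpoint weights
    "qkv_proj": ["q_proj", "k_proj", "v_proj"],
    # fused MLP projection built from separate gate/up checkpoint weights
    "gate_up_proj": ["gate_proj", "up_proj"],
}


def checkpoint_shards(packed_name: str) -> list[str]:
    """Return the checkpoint tensor names a packed module is assembled from.

    Unpacked modules (e.g. "o_proj") map to themselves.
    """
    return packed_modules_mapping.get(packed_name, [packed_name])
```

Without such a mapping, the quantization loader doesn't know that e.g. the quantized q_proj/k_proj/v_proj scales belong to one fused qkv_proj layer, which matches the symptoms in the linked issue.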