Broken inference on vLLM
#2 by Eryk-Chmielewski - opened
I don't know why, but inference on vLLM with the original Qwen-4B-Instruct-2507 works fine.
This quantization, however, only outputs: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
vLLM version 0.10.1.1
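
For reference, here's a minimal offline repro sketch using vLLM's standard `LLM` / `SamplingParams` API. The model id is a placeholder for this repo's quantized checkpoint; substitute the actual path or repo name:

```python
from vllm import LLM, SamplingParams

# Placeholder: replace with the quantized checkpoint from this repo.
MODEL = "path/or/repo-id-of-this-quantization"

llm = LLM(model=MODEL, dtype="auto")  # let vLLM infer dtype from the config
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Hello, who are you?"], params)
for out in outputs:
    # With the broken quantization this prints only "!!!!..." instead of text.
    print(out.outputs[0].text)
```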