Broken inference on vLLM
#2 by Eryk-Chmielewski - opened
I don't know why, but inference on vLLM with the original Qwen-4B-Instruct-2507 works fine.
This quantization, however, only outputs: !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
vLLM version 0.10.1.1
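
For reference, here's a minimal offline repro sketch using vLLM's standard `LLM` / `SamplingParams` API. The model id is a placeholder for this repo's quantized checkpoint; substitute the actual path or repo name:

```python
from vllm import LLM, SamplingParams

# Placeholder: replace with the quantized checkpoint from this repo.
MODEL = "path/or/repo-id-of-this-quantization"

llm = LLM(model=MODEL, dtype="auto")  # let vLLM infer dtype from the config
params = SamplingParams(temperature=0.7, max_tokens=64)

outputs = llm.generate(["Hello, who are you?"], params)
for out in outputs:
    # With the broken quantization this prints only "!!!!..." instead of text.
    print(out.outputs[0].text)
```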