Any one can run this model with SGlang framework?
#13
by
muziyongshixin
- opened
I try to run this model with SGlang, but it is extremely slow. Does anyone have a good setting to run this model with SGlang?
Try run this with vLLM, it is much faster.
you can try this command python3 -m sglang.launch_server --host 0.0.0.0 --port 30000 --model-path models/DeepSeek-R1-AWQ --tp 8 --enable-p2p-check --trust-remote-code --dtype float16 --mem-fraction-static 0.95 --served-model-name deepseek-r1-awq --disable-cuda-graph
with sglang==0.4.2
but the result is not as expected..I get empty content for some queries and the think
is not complete