About the "model_max_length": 16384
#11 opened by AlexWuKing
Same doubt
Same issue. For vLLM, I found the following example on the main page:

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-32B --tensor-parallel-size 2 --max-model-len 32768 --enforce-eager

where the max model length is set to 32768, not 16384.