Update README.md
small repo update in the vllm command
README.md
CHANGED
@@ -26,7 +26,7 @@ otherwise the expert tensors couldn’t be evenly sharded across GPU devices.</i
 ```
 CONTEXT_LENGTH=32768
 vllm serve \
-
+QuantTrio/GLM-4.6-AWQ \
 --served-model-name My_Model \
 --enable-auto-tool-choice \
 --tool-call-parser glm45 \