QuantTrio/GLM-4.5-Air-AWQ-FP16Mix



JunHowie committed 25 days ago

Commit 06af90f · verified · 1 parent: 979d481

Update README.md

Files changed (1)

README.md +2 -2

@@ -8,11 +8,11 @@ tags:
 - quantization fix
 - vLLM
 base_model:
-  - ZhipuAI/GLM-4.5-Air
+  - zai-org/GLM-4.5-Air
 base_model_relation: quantized
 ---
 # GLM-4.5-Air-AWQ-FP16Mix
-Base model: [ZhipuAI/GLM-4.5-Air](https://www.modelscope.cn/models/ZhipuAI/GLM-4.5-Air)
+Base model: [zai-org/GLM-4.5-Air](https://huggingface.co/zai-org/GLM-4.5-Air)
 ### 【vLLM Single Node with 8 GPUs Startup Command】
 <i>Note: You must use `--enable-expert-parallel` to start this model, otherwise the expert tensor TP will not divide evenly. This is required even for 2 GPUs.</i>
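
The README note requires `--enable-expert-parallel`, but the startup command itself falls outside this diff hunk. As a rough sketch only, a single-node launch consistent with that note might look like the following. The `MODEL` and `TP_SIZE` variables are illustrative assumptions, and the README's real command may carry additional flags (context length, port, memory settings) that are not shown here.

```shell
# Assumed values for illustration; adjust to your environment.
MODEL="QuantTrio/GLM-4.5-Air-AWQ-FP16Mix"   # repo id taken from this page
TP_SIZE=8                                    # one tensor-parallel rank per GPU

# --enable-expert-parallel is the flag the README note says is mandatory,
# including on a 2-GPU setup (TP_SIZE=2).
CMD="vllm serve $MODEL --tensor-parallel-size $TP_SIZE --enable-expert-parallel"

# Print the command for review; run it directly once it looks right.
echo "$CMD"
```

Printing the command first keeps the sketch safe to run on a machine without GPUs; dropping the `echo` and executing the line itself starts the server.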