gghfez/command-a-03-2025-AWQ

Tested with vllm==0.10.1

Usage:

vllm serve gghfez/command-a-03-2025-AWQ --port 8080 --host 0.0.0.0 --dtype bfloat16 --max-model-len 32768 -tp 4 --gpu-memory-utilization 0.9
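
Once the server is up, requests go to vLLM's OpenAI-compatible `/v1/chat/completions` endpoint on the port given above. A minimal sketch of building such a request, assuming the default endpoint path and the served model name; the prompt and helper function are hypothetical examples:

```python
import json

# Endpoint exposed by the `vllm serve ... --port 8080` command above
# (vLLM's OpenAI-compatible chat completions API).
BASE_URL = "http://localhost:8080/v1/chat/completions"

def build_chat_request(prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON payload for a chat completion request (hypothetical helper)."""
    return {
        "model": "gghfez/command-a-03-2025-AWQ",
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_chat_request("Summarize AWQ quantization in one sentence.")
print(json.dumps(payload, indent=2))
# POST this payload with Content-Type: application/json to BASE_URL,
# e.g. via curl or any HTTP client.
```
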
Safetensors model size: 20.6B params (tensor types: BF16, I64, I32)