GaLLM-14B-v0.2的GPTQ-Int4量化版,使用方法相同

推荐使用vllm部署,然后使用OpenAI格式的API访问:

vllm serve CjangCjengh/GaLLM-14B-v0.2-GPTQ-Int4 --port <your_port>
Downloads last month
22
Safetensors
Model size
3.33B params
Tensor type
I32
·
FP16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for CjangCjengh/GaLLM-14B-v0.2-GPTQ-Int4