Update inference/bf16_cast_channel_int8.py
#10 opened about 10 hours ago by HandH1998
Update config.json
#9 opened about 10 hours ago by HandH1998
How to achieve 2500 TPS throughput?
#8 opened 1 day ago by muziyongshixin
Can this model run with `ollama` in `pure CPU` mode?
#7 opened 4 days ago by ice6
Add `quantization_config` in config.json?
#4 opened 7 days ago by WeiwenXia · 4 comments
sglang reports OOM after running channel INT8
#3 opened 9 days ago by zhangneilc · 1 comment