YAML Metadata Warning: empty or missing yaml metadata in repo card (https://huggingface.co/docs/hub/model-cards#model-card-metadata)

Deepseek-V3-Lowrank80p


This repository provides the low-rank version of Deepseek-V3, the route expert weights are recovered using low-rank approximation (reduce 20% weights).

Average Score of MMLU (%) Average Score of GSM8K (%)
deepseek/DeepSeek-V3 87.7 94.1
Deepseek-V3-Lowrank80p 86.7 94.5

Reference Implementations

  • gh-efforts/DeepSeek-V3
  • gh-efforts/sglang:
    • sample command:
    DEEPSEEK_RANK=1280 DEEPSEEK_SCALE_RANK=10 python3 -m sglang.launch_server --model-path /data1/asvd_dskv3_packed_63_backup --host 0.0.0.0 --port 40000 --tp-size 8 --enable-ep-moe --trust-remote-code --mem-fraction-static 0.9 --disable-cuda-graph
    
  • xutianyi1999/mistral.rs
Downloads last month
1
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support