Deepseek-V3-Lowrank80p


This repository provides a low-rank version of DeepSeek-V3. The routed expert weights are stored as low-rank factors and recovered by low-rank approximation, which reduces the weights by 20%.
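To make the idea concrete, here is a minimal sketch of low-rank approximation using a plain truncated SVD in NumPy. This is an illustration only: the repository does not state which factorization it uses, and the helper names `low_rank_factors` and `kept_fraction` are hypothetical. A weight matrix of shape `rows x cols` is replaced by two factors of shapes `rows x rank` and `rank x cols`, and the original matrix is approximately recovered as their product.

```python
import numpy as np


def low_rank_factors(weight, rank):
    """Return factors (a, b) such that a @ b is the best rank-`rank`
    approximation of `weight` in the Frobenius norm (truncated SVD)."""
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    a = u[:, :rank] * s[:rank]   # fold singular values into the left factor
    b = vt[:rank, :]
    return a, b


def kept_fraction(rows, cols, rank):
    """Fraction of parameters kept when a rows x cols matrix is stored
    as a (rows x rank) factor plus a (rank x cols) factor."""
    return rank * (rows + cols) / (rows * cols)


# Toy example with a random matrix (not real model weights).
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 32))

a, b = low_rank_factors(w, rank=16)
w_approx = a @ b                      # recovered (approximate) weight

print(a.shape, b.shape)               # factor shapes
print(kept_fraction(64, 32, 16))      # share of parameters kept
print(np.linalg.norm(w - w_approx) / np.linalg.norm(w))  # relative error
```

As a rough check on the "80p" in the name: assuming an expert projection of shape 7168 x 2048 (taken from the public DeepSeek-V3 configuration, not from this repository), `kept_fraction(7168, 2048, 1280)` is about 0.80. Whether this is how the rank used here was chosen is not stated.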

| Model | MMLU average score (%) | GSM8K average score (%) |
|---|---|---|
| deepseek/DeepSeek-V3 | 87.7 | 94.1 |
| Deepseek-V3-Lowrank80p | 86.7 | 94.5 |

Reference Implementations

  • gh-efforts/DeepSeek-V3
  • gh-efforts/sglang:
    • sample command:
    DEEPSEEK_RANK=1280 DEEPSEEK_SCALE_RANK=10 python3 -m sglang.launch_server \
      --model-path /data1/asvd_dskv3_packed_63_backup \
      --host 0.0.0.0 --port 40000 \
      --tp-size 8 --enable-ep-moe --trust-remote-code \
      --mem-fraction-static 0.9 --disable-cuda-graph
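
Once the server from the sample command is running, it can be queried over HTTP. The sketch below assumes the gh-efforts/sglang fork keeps upstream SGLang's native `/generate` endpoint and its `text` / `sampling_params` request fields; this has not been verified against the fork. The helper names `build_generate_request` and `generate` are hypothetical, and the port matches the `--port 40000` flag above.

```python
import json
import urllib.request


def build_generate_request(prompt, host="127.0.0.1", port=40000, max_new_tokens=64):
    """Build a POST request for SGLang's native /generate endpoint.

    Assumption: the fork exposes the same endpoint and payload shape
    as upstream SGLang.
    """
    payload = {
        "text": prompt,
        "sampling_params": {"max_new_tokens": max_new_tokens, "temperature": 0.0},
    }
    return urllib.request.Request(
        f"http://{host}:{port}/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the prompt to a running server and return the generated text."""
    request = build_generate_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=600) as response:
        return json.loads(response.read())["text"]


# Usage, with the server already running:
# print(generate("What is 12 * 7?"))
```

Splitting request construction from the network call keeps the payload easy to inspect without a live server.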
    