DeepSeek-V3 Pruning and Quantization
Original model: tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16.
All quants were made with a modified version of llama.cpp based on bartowski1182-llama.cpp.
All quants were made using the imatrix option, with the importance matrix from tflsxyy/DeepSeek-V3-0324-MoE-Pruner-imatrix.
IQ1_M / Q4_K / Q8_0 : 110.19 GiB (1.87 BPW)
IQ1_S / Q4_K / Q8_0 : 99.45 GiB (1.68 BPW)
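
As a sanity check on the two sizes above, the file size and bits-per-weight (BPW) figures can be turned back into an implied parameter count. This is only a rough sketch: it assumes BPW means total file bits divided by total weight count and that GiB means 2^30 bytes. The helper name implied_params is hypothetical and is not part of llama.cpp.

```python
# Rough consistency check: size (GiB) and BPW together imply a parameter count.
# Assumption: BPW = (file size in bits) / (number of weights), GiB = 2**30 bytes.
GIB = 2**30


def implied_params(size_gib: float, bpw: float) -> float:
    """Return the number of weights implied by a file size and a BPW figure."""
    return size_gib * GIB * 8 / bpw


# Figures copied from the list above: (size in GiB, bits per weight).
quants = {
    "IQ1_M": (110.19, 1.87),
    "IQ1_S": (99.45, 1.68),
}

for name, (size_gib, bpw) in quants.items():
    billions = implied_params(size_gib, bpw) / 1e9
    print(f"{name}: ~{billions:.0f}B parameters implied")
```

Under these assumptions both quants imply a little over 500 billion weights, and the two estimates agree to within about one percent, which is what you would expect from two quantizations of the same pruned model with rounded BPW values.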
Base model: deepseek-ai/DeepSeek-V3-0324.