metadata
quantized_by: bobchenyx
base_model:
- tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16
- deepseek-ai/DeepSeek-V3-0324
base_model_relation: quantized
license: mit
tags:
- deepseek_v3
- deepseek
- transformers
- GGUF
pipeline_tag: text-generation
Llamacpp Quantizations of DeepSeek-V3-0324-MoE-Pruner-E192 by tflsxyy
Original model: tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16.
All quants were made with a modified version of llama.cpp based on bartowski1182-llama.cpp.
All quants were made using the imatrix option, based on tflsxyy/DeepSeek-V3-0324-MoE-Pruner-imatrix.
IQ1_M / Q4_K / Q8_0 : 110.19 GiB (1.87 BPW)
IQ1_S / Q4_K / Q8_0 : 99.45 GiB (1.68 BPW)
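
As a rough sanity check on the sizes above, the file size and the bits-per-weight (BPW) figure together imply an approximate weight count. The sketch below assumes that BPW is simply total file bits divided by the number of weights and that 1 GiB is 2^30 bytes; the helper name `implied_params` is illustrative and not part of llama.cpp. Under these assumptions both rows should land at roughly 5 × 10^11 weights, with small differences expected from rounding in the listed figures.

```python
GIB = 2**30  # bytes per GiB (assumption: sizes above are binary gibibytes)


def implied_params(size_gib: float, bpw: float) -> float:
    """Approximate weight count implied by a file size and an average BPW.

    Assumes BPW = (file size in bits) / (number of weights), ignoring any
    metadata stored alongside the tensors.
    """
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


# Sizes and BPW values copied from the list above.
quants = {
    "IQ1_M / Q4_K / Q8_0": (110.19, 1.87),
    "IQ1_S / Q4_K / Q8_0": (99.45, 1.68),
}

for name, (size_gib, bpw) in quants.items():
    billions = implied_params(size_gib, bpw) / 1e9
    print(f"{name}: ~{billions:.0f}B weights")
```

The same relation can be inverted to estimate whether a given quant fits in available memory: multiply the weight count by the BPW and divide by 8 to get bytes.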