README.md · bobchenyx/DeepSeek-V3-0324-HFK-E192-GGUF at main

metadata

quantized_by: bobchenyx
base_model:
  - tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16
  - deepseek-ai/DeepSeek-V3-0324
base_model_relation: quantized
license: mit
tags:
  - deepseek_v3
  - deepseek
  - transformers
  - GGUF
pipeline_tag: text-generation

llama.cpp Quantizations of DeepSeek-V3-0324-MoE-Pruner-E192 by tflsxyy

Original model: tflsxyy/DeepSeek-V3-0324-MoE-Pruner-E192-bf16.

All quants were made with a modified build of llama.cpp based on bartowski1182-llama.cpp.

All quants were made using the imatrix option, with the importance matrix from tflsxyy/DeepSeek-V3-0324-MoE-Pruner-imatrix.

IQ1_M / Q4_K / Q8_0 : 110.19 GiB (1.87 BPW)

IQ1_S / Q4_K / Q8_0 : 99.45 GiB (1.68 BPW)
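
As a rough sanity check on the figures above, the file size and bits-per-weight (BPW) of each quant should imply roughly the same parameter count. The sketch below assumes BPW means total file bits divided by total weights, and that GiB means 2^30 bytes. The function name `implied_params` is illustrative and not part of llama.cpp.

```python
GIB = 1024 ** 3  # bytes per GiB (assumption: sizes are binary gibibytes)


def implied_params(size_gib: float, bpw: float) -> float:
    """Return the parameter count implied by a file size and a BPW figure."""
    total_bits = size_gib * GIB * 8
    return total_bits / bpw


# Sizes and BPW values copied from the list above.
quants = {
    "IQ1_M / Q4_K / Q8_0": (110.19, 1.87),
    "IQ1_S / Q4_K / Q8_0": (99.45, 1.68),
}

for name, (size_gib, bpw) in quants.items():
    billions = implied_params(size_gib, bpw) / 1e9
    print(f"{name}: ~{billions:.0f}B parameters implied")
```

Both quants imply a parameter count a little above 500 billion, and the two estimates agree to within about 1%, which is what rounding the BPW to two decimals would allow.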