Fix README.md
## Model Overview

This model was obtained by quantizing the weights of [deepseek-ai/DeepSeek-V3-0324](https://huggingface.co/deepseek-ai/DeepSeek-V3-0324) to the INT4 data type. This optimization reduces the number of bits per parameter from 8 to 4, cutting the disk size and GPU memory requirements by approximately 50%.

Only the non-shared experts within the transformer blocks are compressed. Weights are quantized using a symmetric per-group scheme with a group size of 128, and the GPTQ algorithm is applied for quantization.
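To make the "symmetric per-group scheme, with group size 128" concrete, here is a minimal round-to-nearest sketch of that scheme. Note this is an illustration only, with function names of my own choosing: the actual model uses GPTQ, which additionally uses second-order (Hessian-based) information to choose how each weight is rounded, rather than plain nearest rounding.

```python
import numpy as np

def quantize_symmetric_per_group(w, group_size=128, num_bits=4):
    """Symmetric per-group quantization (round-to-nearest sketch).

    Each contiguous group of `group_size` weights shares one scale,
    chosen so the group's largest magnitude maps to the INT4 extreme.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 7 for INT4 (symmetric range)
    w = np.asarray(w, dtype=np.float32)
    groups = w.reshape(-1, group_size)          # one row per group
    scales = np.abs(groups).max(axis=1, keepdims=True) / qmax
    scales = np.where(scales == 0, 1.0, scales) # guard all-zero groups
    q = np.clip(np.round(groups / scales), -qmax - 1, qmax).astype(np.int8)
    return q.reshape(w.shape), scales

def dequantize(q, scales, group_size=128):
    """Recover approximate FP32 weights from INT4 values and per-group scales."""
    return (q.reshape(-1, group_size) * scales).reshape(q.shape).astype(np.float32)

# Round trip: 256 weights form two groups of 128, each with its own scale.
w = np.random.default_rng(0).normal(size=256).astype(np.float32)
q, s = quantize_symmetric_per_group(w)
w_hat = dequantize(q, s)
```

Because the scheme is symmetric, only a scale (no zero-point) is stored per group, and the per-element reconstruction error is bounded by half of that group's scale.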