---
license: mit
library_name: mlx
base_model: deepseek-ai/DeepSeek-V3.1
tags:
- mlx
pipeline_tag: text-generation
---
## CURRENTLY UPLOADING FILES
This notice will be removed once all files have been uploaded.
...

**See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**

*The q5.5 quant typically achieves a perplexity of 1.141 in our testing.*

| Quantization | Perplexity |
|:------------:|:----------:|
| **q2.5**     | 41.293     |
| **q3.5**     | 1.900      |
| **q4.5**     | 1.168      |
| **q5.5**     | 1.141      |
| **q6.5**     | 1.128      |
| **q8.5**     | 1.128      |
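
For readers unfamiliar with the metric, perplexity is conventionally the exponential of the mean negative log-likelihood per token, so a value near 1 means the model assigns high probability to the reference text. The sketch below illustrates that standard definition only. It is not the evaluation script used for the table, and the log-probabilities in it are made-up values.

```python
import math


def perplexity(token_logprobs):
    """Return exp(mean negative log-likelihood) over a sequence of tokens.

    token_logprobs: natural-log probabilities the model assigned to each
    reference token. How these are obtained (dataset, context length) is
    not specified by this card and is left to the caller.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token log-probabilities, for illustration only.
example_logprobs = [-0.05, -0.20, -0.10, -0.18]
print(f"perplexity: {perplexity(example_logprobs):.3f}")
```

Under this definition, the jump from q3.5 to q2.5 in the table corresponds to a much larger average per-token loss, not just a proportionally worse score.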

## Usage Notes

* Runs on a single M3 Ultra with 512 GB of RAM
* Memory usage: ~480 GB
* Expect ~15 tokens/s
* Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
* For more details, see the [demonstration video](https://youtube.com/xcreate) or visit the [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1) model page.
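
As a rough sanity check on the memory figure above, the weight footprint can be estimated from the parameter count and the average bits per weight. This is a back-of-the-envelope sketch under two assumptions not stated in this card: that the base model has about 671 billion parameters, and that the "q5.5" label means roughly 5.5 bits per weight on average. It ignores the KV cache, activations, and runtime overhead.

```python
def quantized_weight_gb(num_params, bits_per_weight):
    """Approximate size of the quantized weights in decimal gigabytes."""
    return num_params * bits_per_weight / 8 / 1e9


# Assumption: ~671e9 total parameters for the base model (not from this card).
ASSUMED_PARAMS = 671e9

for bits in (2.5, 3.5, 4.5, 5.5, 6.5, 8.5):
    size = quantized_weight_gb(ASSUMED_PARAMS, bits)
    print(f"q{bits}: ~{size:.0f} GB of weights")
```

Under these assumptions the q5.5 weights alone come to roughly 461 GB, which is in line with the ~480 GB usage reported above once runtime overhead is added.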