File size: 951 Bytes
f2ee58a 6c5d703 b72d9c4 6c5d703 f2ee58a 6c5d703 f2ee58a 6c5d703 f2ee58a 6c5d703 f2ee58a |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 |
---
license: mit
library_name: mlx
base_model: deepseek-ai/DeepSeek-V3.1
tags:
- mlx
pipeline_tag: text-generation
---
## CURRENTLY UPLOADING FILES
This notice will be removed once all files have been uploaded.
...
**See DeepSeek-V3.1 5.5bit MLX in action - [demonstration video](https://youtube.com/xcreate)**
*q5.5bit quant typically achieves 1.141 perplexity in our testing*
| Quantization | Perplexity |
|:------------:|:----------:|
| **q2.5** | 41.293 |
| **q3.5** | 1.900 |
| **q4.5** | 1.168 |
| **q5.5** | 1.141 |
| **q6.5** | 1.128 |
| **q8.5** | 1.128 |
## Usage Notes
* Runs on a single M3 Ultra 512GB RAM
* Memory usage: ~480 GB
* Expect ~15 tokens/s
* Built with a modified version of [MLX](https://github.com/ml-explore/mlx) 0.26
* For more details see [demonstration video](https://youtube.com/xcreate) or visit [DeepSeek-V3.1](https://huggingface.co/deepseek-ai/DeepSeek-V3.1). |