Create README.md
README.md CHANGED
@@ -1,3 +1,24 @@
---
quantized_by: ubergarm
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-V3-0324
license: mit
base_model_relation: quantized
---

## `ik_llama.cpp` imatrix MLA Quantizations of DeepSeek-V3-0324 by deepseek-ai

This collection of quants is intended for use with the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork, whose MLA (multi-head latent attention) implementation allows 64k context length in under 24GB VRAM for `R1` and `V3`.

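As a rough sketch of what serving one of these quants with the fork's MLA path can look like, here is an illustrative `llama-server` invocation; the model filename, GPU layer count, thread count, and tensor-override pattern below are placeholders that depend on your hardware and on the specific quant, not values taken from this card:

```bash
# Illustrative sketch only; adjust paths, layer counts, and threads for your setup.
#   -mla 2 -fa     : MLA attention plus flash attention
#   -ctk q8_0      : quantized KV cache to further reduce VRAM use
#   -amb 512       : bound the size of the attention compute buffer
#   -fmoe          : fused MoE kernels
#   -ot exps=CPU   : keep the routed experts in system RAM, everything else on GPU
./build/bin/llama-server \
    --model /models/DeepSeek-V3-0324-IQ4_K_R4.gguf \
    --ctx-size 65536 \
    -ctk q8_0 \
    -mla 2 -fa \
    -amb 512 \
    -fmoe \
    --n-gpu-layers 63 \
    --override-tensor exps=CPU \
    --threads 16 \
    --host 127.0.0.1 \
    --port 8080
```

See the Getting Started Guide linked in the References below for the full set of options and up-to-date recommendations.
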
## TODO

- [ ] Upload imatrix.dat with MLA tensors (see the sketch after this list)
- [ ] Upload my favorite quants, built with the SOTA quant types available in `ik_llama.cpp`

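For context on the first item, an MLA-aware imatrix is generated with the fork's `llama-imatrix` tool run over a calibration text. The sketch below is illustrative only; the calibration file, context size, and the idea of passing `-mla` so the MLA-specific attention tensors collect importance data are assumptions, not the exact recipe behind the upcoming upload:

```bash
# Illustrative sketch; file names and flags are placeholders.
./build/bin/llama-imatrix \
    -m /models/DeepSeek-V3-0324-Q8_0.gguf \
    -f calibration_data.txt \
    -o imatrix.dat \
    --ctx-size 512 \
    -mla 2   # assumed: run with MLA enabled so the MLA tensors receive importance data
```
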
## Big Thanks

Big thanks to all the folks in the quant and inferencing community for sharing tips and tricks to help everyone run these fun new models!

## References

* [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
* [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)