ubergarm committed
Commit 2b89e41 · 1 Parent(s): 48ec128

update README

Files changed (1): README.md (+10 −5)
README.md CHANGED
@@ -8,17 +8,22 @@ base_model_relation: quantized
 
 ## `ik_llama.cpp` imatrix MLA Quantizations of DeepSeek-V3-0324 by deepseek-ai
 
-This collection of quants is intended for use with the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork's MLA, allowing 64k context length in under 24GB VRAM for `R1` and `V3`.
+This quant collection is intended for use with the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork.
+
+All of these quants support MLA, allowing 32k (some even 64k) context length in under 24GB GPU VRAM for `R1` and `V3` while offloading MoE layers to CPU RAM.
 
 ## TODO
 
-- [ ] Upload imatrix.dat with MLA tensors
-- [ ] Upload my favorite quants using SOTA quants available on `ik_llama.cpp`
+- [ ] Upload imatrix.dat computed with MLA tensors
+- [ ] Upload my favorite SOTA quants available on `ik_llama.cpp` for hybrid GPU+CPU and pure-CPU optimized inferencing
 
 ## Big Thanks
+Big thanks to all the folks in the quanting and inferencing community here and on `r/LocalLLaMA` for sharing tips and tricks to help each other access all the fun new models!
+
+Shout out to the **Level1Techs** crew and their community [Forums](https://forum.level1techs.com/t/deepseek-deep-dive-r1-at-home/225826) and [YouTube Channel](https://www.youtube.com/@Level1Techs) for providing big-hardware expertise and access to run these experiments!
 
-Big thanks to all the folks in the quant and inferencing community for sharing tips and tricks to help everyone run these fun new models!
+Finally, I'm still learning the ropes, so please be patient and we can learn together. Thanks!
 
 ## References
 * [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
-* [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)
+* [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)
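
As context for the MLA and MoE-offload claim in the README, here is a hedged launch sketch. The flag names (`-mla`, `-fmoe`, `-amb`, `-ctk`, `-ot`) come from `ik_llama.cpp` discussions, but the model filename and every value below are illustrative assumptions, not this repo's recommended settings; check `llama-server --help` on your build.

```shell
# Hypothetical hybrid GPU+CPU launch for a DeepSeek V3/R1 quant (a sketch,
# not a recommendation from this repo). Assumed flag meanings:
#   -mla 2        enable MLA attention (the feature these quants target)
#   -fmoe         fused MoE kernels
#   -amb 512      cap the attention compute buffer (MiB)
#   -ctk q8_0     quantized KV cache, helping 32k context fit under 24GB VRAM
#   -ngl 63       offload all layers to GPU...
#   -ot exps=CPU  ...but keep routed-expert (MoE) tensors in CPU RAM
./build/bin/llama-server \
    --model DeepSeek-V3-0324-IQ4_K_R4.gguf \
    -mla 2 -fmoe -amb 512 -ctk q8_0 \
    -c 32768 -ngl 63 -ot exps=CPU \
    --threads 24
```

The Getting Started Guide linked in the References covers the actual recommended arguments per quant.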