update README

README.md

## `ik_llama.cpp` imatrix MLA Quantizations of DeepSeek-V3-0324 by deepseek-ai

This quant collection is intended for use with the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork.
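
These quants rely on the fork's quant types and MLA support, so you'll need to build `ik_llama.cpp` from source. A minimal sketch, assuming a CUDA build (see the Getting Started Guide under References for the full walkthrough):

```bash
# Minimal sketch: clone and build ik_llama.cpp with CUDA enabled.
# For pure-CPU inference, drop the -DGGML_CUDA=ON option.
git clone https://github.com/ikawrakow/ik_llama.cpp
cd ik_llama.cpp
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release -j $(nproc)
```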

All of these quants support MLA, allowing 32k (some even 64k) of context in under 24GB of GPU VRAM for `R1` and `V3` while offloading the MoE layers to CPU RAM.
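
As a rough illustration of that hybrid setup, here is a sketch of a server launch on a single 24GB GPU. The model file name, thread count, and buffer size are placeholders, and the `-mla`, `-fa`, `-amb`, `-fmoe`, and `--override-tensor` options are covered in the Getting Started Guide under References:

```bash
# Hypothetical sketch: 32k context on one 24GB GPU, with the routed
# MoE expert tensors overridden to stay in CPU RAM.
# Adjust --model to the quant file you downloaded and --threads
# to your physical core count.
./build/bin/llama-server \
    --model DeepSeek-V3-0324-IQ2_K_R4.gguf \
    --ctx-size 32768 \
    -ctk q8_0 \
    -mla 2 -fa \
    -amb 512 \
    -fmoe \
    --n-gpu-layers 63 \
    --override-tensor exps=CPU \
    --threads 16 \
    --host 127.0.0.1 --port 8080
```

The `--override-tensor exps=CPU` line is what makes this fit: everything except the routed experts lands on the GPU, while the big MoE weights stay in system RAM.
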
## TODO

- [ ] Upload `imatrix.dat` computed with MLA tensors (see the sketch after this list)
- [ ] Upload my favorite SOTA quants available on `ik_llama.cpp`, optimized for hybrid GPU+CPU and pure-CPU inference

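On the first TODO item: the point is to compute the importance matrix with MLA enabled so the decomposed MLA attention tensors accumulate calibration statistics. A rough sketch, assuming the fork's `llama-imatrix` accepts the same MLA flags as its other tools and with `calibration.txt` standing in for your calibration corpus:

```bash
# Hypothetical sketch: compute an imatrix with MLA enabled so the
# MLA attention tensors are exercised during calibration.
# calibration.txt and the model file name are placeholders.
./build/bin/llama-imatrix \
    -m DeepSeek-V3-0324-Q8_0.gguf \
    -f calibration.txt \
    -o imatrix.dat \
    -mla 2 -fa \
    --ctx-size 512
```
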
## Big Thanks

Big thanks to all the folks in the quanting and inferencing community here and on `r/LocalLLaMA` for sharing tips and tricks to help each other access all the fun new models!

Shout out to the **Level1Techs** crew and community ([Forums](https://forum.level1techs.com/t/deepseek-deep-dive-r1-at-home/225826), [YouTube Channel](https://www.youtube.com/@Level1Techs)) for providing big hardware expertise and access to run these experiments!!!

Finally, I'm still learning the ropes, so please be patient and we can learn together. Thanks!

## References

* [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
* [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)