Create README.md
README.md CHANGED
@@ -1,3 +1,24 @@
---
quantized_by: ubergarm
pipeline_tag: text-generation
base_model: deepseek-ai/DeepSeek-V3-0324
license: mit
base_model_relation: quantized
---

## `ik_llama.cpp` imatrix MLA Quantizations of DeepSeek-V3-0324 by deepseek-ai

This collection of quants is intended for use with the [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/) fork, whose MLA (multi-head latent attention) implementation allows 64k context length in under 24GB VRAM for `R1` and `V3`.

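As a rough sketch of what serving one of these quants with the fork's MLA path can look like, here is an illustrative `llama-server` invocation; the model filename, GPU layer count, thread count, and tensor-override pattern below are placeholders that depend on your hardware and on the specific quant, not values taken from this card:

```bash
# Illustrative sketch only; adjust paths, layer counts, and threads for your setup.
#   -mla 2 -fa     : MLA attention plus flash attention
#   -ctk q8_0      : quantized KV cache to further reduce VRAM use
#   -amb 512       : bound the size of the attention compute buffer
#   -fmoe          : fused MoE kernels
#   -ot exps=CPU   : keep the routed experts in system RAM, everything else on GPU
./build/bin/llama-server \
    --model /models/DeepSeek-V3-0324-IQ4_K_R4.gguf \
    --ctx-size 65536 \
    -ctk q8_0 \
    -mla 2 -fa \
    -amb 512 \
    -fmoe \
    --n-gpu-layers 63 \
    --override-tensor exps=CPU \
    --threads 16 \
    --host 127.0.0.1 \
    --port 8080
```

See the Getting Started Guide linked in the References below for the full set of options and up-to-date recommendations.
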
## TODO

- [ ] Upload imatrix.dat with MLA tensors (see the sketch after this list)
- [ ] Upload my favorite quants, built with the SOTA quant types available in `ik_llama.cpp`

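For context on the first item, an MLA-aware imatrix is generated with the fork's `llama-imatrix` tool run over a calibration text. The sketch below is illustrative only; the calibration file, context size, and the idea of passing `-mla` so the MLA-specific attention tensors collect importance data are assumptions, not the exact recipe behind the upcoming upload:

```bash
# Illustrative sketch; file names and flags are placeholders.
./build/bin/llama-imatrix \
    -m /models/DeepSeek-V3-0324-Q8_0.gguf \
    -f calibration_data.txt \
    -o imatrix.dat \
    --ctx-size 512 \
    -mla 2   # assumed: run with MLA enabled so the MLA tensors receive importance data
```
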
## Big Thanks

Big thanks to all the folks in the quant and inferencing community for sharing tips and tricks to help everyone run these fun new models!

## References

* [ik_llama.cpp](https://github.com/ikawrakow/ik_llama.cpp/)
* [ik_llama.cpp Getting Started Guide](https://github.com/ikawrakow/ik_llama.cpp/discussions/258)