ubergarm committed
Commit 52485c7 · 1 Parent(s): e1ce4cd

update readme
Files changed (1): README.md (+19 −1)
README.md CHANGED
@@ -97,7 +97,7 @@ numactl -N 0 -m 0 \
 
 These are probably the **best quants available in this size class** for `V3-0324`!
 
-[!][Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`](benchmarks-01.png "Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`")
+![Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`](benchmarks-01.png "Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`")
 
 ubergarm made no sacrifices for token embedding, attention, dense
 layers, or shared experts. This is possible because `ik_llama.cpp` MLA
@@ -220,6 +220,10 @@ Final estimate: PPL = 3.4755 +/- 0.03305
 
 #### Quant Cookers Secret Recipe
 
+<details>
+
+<summary>Secret Recipe</summary>
+
 ```bash
 #!/usr/bin/env bash
 
@@ -284,8 +288,14 @@ custom=$(
 24
 ```
 
+</details>
+
 #### Perplexity
 
+<details>
+
+<summary>Perplexity Logs</summary>
+
 ```bash
 $ CUDA_VISIBLE_DEVICES="0," \
 ./build/bin/llama-perplexity \
@@ -701,8 +711,16 @@ llama_print_timings: total time = 2841519.57 ms / 287233 tokens
 Final estimate: PPL = 3.5614 +/- 0.02001
 ```
 
+</details>
+
 #### Split
 
+<details>
+
+<summary>Split GGUF</summary>
+
+*TODO*: Add key value metadata information before publishing.
+
 ```bash
 $ ./build/bin/llama-gguf-split \
 --dry-run \
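
The net effect of the additions above is to wrap each long log section in a collapsible GitHub-flavored-Markdown block. A minimal sketch of the resulting pattern (placeholder content, not the actual README text) looks like:

````markdown
#### Perplexity

<details>

<summary>Perplexity Logs</summary>

```bash
# long command output goes here, hidden until the reader expands the section
```

</details>
````

Note the blank lines after `<details>` and around `<summary>`: Markdown inside an HTML block is only rendered when it is separated from the surrounding tags by blank lines, which is why the diff adds them between each tag.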