update readme
README.md
CHANGED
@@ -97,7 +97,7 @@ numactl -N 0 -m 0 \
 
 These are probably the **best quants available in this size class** for `V3-0324`!
 
-
+
 
 ubergarm made no sacrifices for token embedding, attention, dense
 layers, or shared experts. This is possible because `ik_llama.cpp` MLA
@@ -220,6 +220,10 @@ Final estimate: PPL = 3.4755 +/- 0.03305
 
 #### Quant Cookers Secret Recipe
 
+<details>
+
+<summary>Secret Recipe</summary>
+
 ```bash
 #!/usr/bin/env bash
 
@@ -284,8 +288,14 @@ custom=$(
 24
 ```
 
+</details>
+
 #### Perplexity
 
+<details>
+
+<summary>Perplexity Logs</summary>
+
 ```bash
 $ CUDA_VISIBLE_DEVICES="0," \
 ./build/bin/llama-perplexity \
@@ -701,8 +711,16 @@ llama_print_timings: total time = 2841519.57 ms / 287233 tokens
 Final estimate: PPL = 3.5614 +/- 0.02001
 ```
 
+</details>
+
 #### Split
 
+<details>
+
+<summary>Split GGUF</summary>
+
+*TODO*: Add key value metadata information before publishing.
+
 ```bash
 $ ./build/bin/llama-gguf-split \
 --dry-run \