ubergarm committed
Commit aa6db93 · 1 Parent(s): 8e1f011

fixup readme

Files changed (1): README.md +2 -2
README.md CHANGED
@@ -40,7 +40,7 @@ Special mix `IQ3_K_R4`/`IQ2_K_R4` routed experts with all other layers full `q8_
40
  Great for CPU+GPU "troll rig" high end gamer systems e.g. 9950X 96 GB RAM + 3090TI 24 GB VRAM + Gen 5 NVMe SSD.
41
 
42
  #### Custom Mixes
43
- If you have more than 48GB VRAM across multiple GPUs, consider rolling your can own custom quants to optimize size and performance with whatever hardware you have using custom `-ot` expression. If you have less VRAM, you could make a custom quant leaner in the non routed expert layers or get 64k+ context in 24GB VRAM. Also you can use the offline repack tool if you want to do CPU only with `mmap()` still enabled.
44
 
45
  ## Quick Start
46
  #### `ik_llama.cpp` API server for GPU+CPU
@@ -95,7 +95,7 @@ numactl -N 0 -m 0 \
95
 
96
  ## Quant Comparisons
97
 
98
- These are probably the **best quants available in this size class** for `V3-0324`!
99
 
100
  ![Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`](images/benchmarks-01.png "Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`")
101
 
 
40
  Great for CPU+GPU "troll rig" high end gamer systems e.g. 9950X 96 GB RAM + 3090TI 24 GB VRAM + Gen 5 NVMe SSD.
41
 
42
  #### Custom Mixes
43
+ If you have more than 48GB VRAM across multiple GPUs, consider rolling your own custom quants to optimize size and performance with whatever hardware you have using custom `-ot` expression. If you have less VRAM, you could make a custom quant leaner in the non routed expert layers or get 64k+ context in 24GB VRAM. Also you can use the offline repack tool if you want to do CPU only with `mmap()` still enabled.
44
 
45
  ## Quick Start
46
  #### `ik_llama.cpp` API server for GPU+CPU
 
95
 
96
  ## Quant Comparisons
97
 
98
+ These are probably among the **best quants available in this size class** for `V3-0324`!
99
 
100
  ![Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`](images/benchmarks-01.png "Benchmarks showing these quants are smaller in size yet similar in performance to the `Q8_0`")
101
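For context on the `-ot` expression the diff refers to, here is a minimal sketch of the kind of tensor-override pattern used to keep routed-expert tensors in system RAM while everything else goes to VRAM. The tensor names and the `llama-server` invocation in the trailing comment are assumptions for illustration, not taken from this README.

```shell
# Hypothetical sketch of a custom -ot (tensor override) pattern, assuming
# llama.cpp-style MoE tensor names such as blk.N.ffn_down_exps.weight;
# check the actual tensor names in your GGUF before using a pattern like this.
PATTERN='blk\..*\.ffn_.*_exps\.'

# Preview which tensors the pattern would pin to CPU vs. leave on GPU:
for t in blk.3.ffn_down_exps.weight blk.3.attn_q.weight; do
  if printf '%s\n' "$t" | grep -qE "$PATTERN"; then
    echo "$t -> CPU"
  else
    echo "$t -> GPU"
  fi
done

# The pattern would then be passed to the server roughly as:
#   ./llama-server -m model.gguf -ngl 99 -ot "${PATTERN}=CPU"
```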