DavidAU committed
Commit 3a5d8e0 · verified · 1 Parent(s): 8d24afd

Update README.md

Files changed (1)
  1. README.md +27 -2
README.md CHANGED
@@ -35,11 +35,36 @@ Quick testing shows optimized can:
  - Answer/solve at lower quant.
  - Come up with a better answer/stronger reasoning
  - Use fewer tokens to "reason" ... up to 50% less.
- - Faster and smaller quant size.
+ - Faster and smaller quant size (vs. "MAX" quants, which keep the output tensor and embeddings at BF16).
+
+ Quants - "EDGE of REASON":
+
+ IQ1_M - Works, but reasoning is limited (it reasons, but struggles to reach the right answer for some problems, if it gets there at all).
+
+ ...
+
+ IQ2_S - Moderate reasoning; impressive performance both for reasoning AND for a quant this size.
+
+ ...
+
+ For best performance: IQ3_M, IQ4_XS/NL, or the Q4s.
+
+ ...
+
+ For TOP performance: Q6/Q8.
+
+ ...
+
+ All quants have been optimized with:
+
+ - NEO Imatrix Dataset
+ - BF16 output tensor (full precision)
+
+ I found this config worked best with this specific model and with "reasoning" in general.
 
  ---
 
- Reka's excellent reasoning model with MAXed quants, and NEO Imatrix dataset.
+ Reka's excellent reasoning model with MAX (level 1) quants, and NEO Imatrix dataset.
 
  128k context.
70