Text Generation
GGUF
English
NEO Imatrix
MAX Quants
uncensored
reasoning
thinking
r1
cot
reka-flash
deepseek
Qwen2.5
Hermes
DeepHermes
DeepSeek
DeepSeek-R1-Distill
128k context
instruct
all use cases
maxed quants
Neo Imatrix
finetune
chatml
gpt4
synthetic data
distillation
function calling
roleplaying
chat
Uncensored
creative
general usage
problem solving
brainstorming
solve riddles
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
story
writing
fiction
swearing
horror
imatrix
conversational
Update README.md
README.md CHANGED
```diff
@@ -35,11 +35,36 @@ Quick testing shows optimized can:
 - Answer/solve at lower quant.
 - Come up with a better answer / stronger reasoning.
 - Use fewer tokens to "reason" ... up to 50% fewer.
-- Faster and smaller quant size.
+- Faster and smaller quant size (vs "MAX" quants, which keep both the output tensor and embeddings at BF16).
+
+Quants - "EDGE of REASON":
+
+IQ1_M - Works, but limited reasoning (reasoning operates, but has a tough time, if at all, coming up with the right answer for some problems).
+
+...
+
+IQ2_S - Moderate reasoning; impressive performance for both the reasoning AND this quant.
+
+...
+
+For best performance: IQ3_M, IQ4_XS/IQ4_NL, or the Q4 quants.
+
+...
+
+For TOP performance: Q6/Q8.
+
+...
+
+All quants have been optimized with:
+
+- NEO Imatrix dataset
+- BF16 output tensor (full precision)
+
+I found this config worked best with this specific model and with "reasoning" in general.
 
 ---
 
-Reka's excellent reasoning model with
+Reka's excellent reasoning model with MAX (level 1) quants and the NEO Imatrix dataset.
 
 128k context.
 
```
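For anyone who wants to reproduce this style of quant, below is a minimal sketch of the general recipe (importance matrix plus a BF16 output tensor) using llama.cpp's `llama-imatrix` and `llama-quantize` tools. All file names, including the calibration text, are placeholders; the actual NEO Imatrix dataset and the exact commands used for these quants are not published here.

```bash
# Minimal sketch, assuming llama.cpp tools on PATH; all file names are placeholders.

# 1. Build an importance matrix from a calibration text file.
llama-imatrix -m Reka-Flash-3-F16.gguf -f calibration.txt -o neo.imatrix

# 2. Quantize with the imatrix, keeping the output tensor at BF16.
#    ("MAX" quants would additionally set --token-embedding-type bf16.)
llama-quantize --imatrix neo.imatrix --output-tensor-type bf16 \
  Reka-Flash-3-F16.gguf Reka-Flash-3-IQ4_XS.gguf IQ4_XS
```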
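Likewise, a minimal sketch of running one of these quants at the full 128k context with llama.cpp's `llama-cli` (the file name is again a placeholder; a 131072-token KV cache needs substantial memory, so lower `-c` on constrained hardware):

```bash
# Run a quant at the model's full 128k (131072-token) context window.
llama-cli -m Reka-Flash-3-IQ4_XS.gguf -c 131072 \
  -p "Think step by step: how many prime numbers are there below 30?"
```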