DavidAU committed
Commit 3a5d8e0 · verified · 1 Parent(s): 8d24afd

Update README.md

Files changed (1)
  1. README.md +27 -2
README.md CHANGED
@@ -35,11 +35,36 @@ Quick testing shows optimized can:
  - Answer/solve at lower quant.
  - Come up with a better answer/stronger reasoning
  - Use fewer tokens to "reason" ... up to 50% less.
- - Faster and smaller quant size.
+ - Faster and smaller quant size (vs. "MAX" quants, which keep the output tensor and embeddings at BF16).
+
+ Quants - "EDGE of REASON":
+
+ IQ1_M - Works, but reasoning is limited (it reasons, but struggles to reach the right answer for some problems, if it gets there at all).
+
+ ...
+
+ IQ2_S - Moderate reasoning; impressive performance both for reasoning AND for a quant this size.
+
+ ...
+
+ For best performance: IQ3_M, IQ4_XS/NL, or the Q4s.
+
+ ...
+
+ For TOP performance: Q6/Q8.
+
+ ...
+
+ All quants have been optimized with:
+
+ - NEO Imatrix Dataset
+ - BF16 output tensor (full precision)
+
+ I found this config worked best with this specific model and with "reasoning" in general.
 
  ---
 
- Reka's excellent reasoning model with MAXed quants, and NEO Imatrix dataset.
+ Reka's excellent reasoning model with MAX (level 1) quants, and NEO Imatrix dataset.
 
  128k context.
70