Text Generation
GGUF
English
NEO Imatrix
MAX Quants
uncensored
reasoning
thinking
r1
cot
reka-flash
deepseek
Qwen2.5
Hermes
DeepHermes
DeepSeek
DeepSeek-R1-Distill
128k context
instruct
all use cases
maxed quants
Neo Imatrix
finetune
chatml
gpt4
synthetic data
distillation
function calling
roleplaying
chat
Uncensored
creative
general usage
problem solving
brainstorming
solve riddles
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
story
writing
fiction
swearing
horror
imatrix
conversational
Update README.md

README.md CHANGED

@@ -26,18 +26,18 @@ pipeline_tag: text-generation
 
 UPDATE:
 
-Re-optimizing quants, found a better mixture. Uploading
+Re-optimizing quants, found a better mixture. Uploading NOW...
 
 Mixture is strong enough that lower quants can now solve/reason and come up with a solution, whereas NON-optimized quants
 may not be able to solve at all, or take a lot longer (a lot more tokens!).
 
-Quick testing shows optimized can:
-- Answer/solve at lower quant.
-- Come up with a better answer/stronger reasoning
+Quick testing shows optimized quants can:
+- Answer/solve at a lower quant // solve where the "reg quant" could not.
+- Come up with a better answer/stronger reasoning.
 - Use fewer tokens to "reason" ... up to 50% fewer.
 - Faster and smaller quant sizes (VS "MAX" quants with output tensor and embed at BF16).
 
-Quants - "EDGE of REASON"
+<B>Quants - "EDGE of REASON":</B>
 
 IQ1_M - Works, but limited reasoning (reasoning operates, but has a tough time (if at all) coming up with the right answer for some problems).
 
@@ -55,9 +55,12 @@ For TOP performance, Q6/Q8.
 
 ...
 
-All quants have been optimized with:
+All quants (IQ1 right up to Q6) have been optimized with:
 
-- NEO Imatrix Dataset
+- NEO Imatrix Dataset.
+- BF16 output tensor (full precision).
+
+Q8 (the imatrix has no effect on Q8):
 - BF16 output tensor (full precision).
 
 I found this config worked best with this specific model and with "reasoning" in general.
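For readers who want to experiment with this kind of recipe themselves, here is a minimal sketch driving llama.cpp's stock `llama-imatrix` and `llama-quantize` tools from Python. Only the ideas come from the README above (apply an imatrix to the IQ1..Q6 quants, force the output tensor to BF16, skip the imatrix for Q8); the file names, the calibration text, and the exact quant list are placeholder assumptions — the actual NEO Imatrix dataset is not published in this card.

```python
"""Hypothetical sketch of the quant recipe described above, using
llama.cpp's llama-imatrix / llama-quantize CLIs (assumed to be on PATH).
All file names below are placeholders, not the author's actual files."""
import subprocess

MODEL_F16 = "model-f16.gguf"         # full-precision source GGUF (placeholder)
IMATRIX = "neo-imatrix.dat"          # importance-matrix output file
CALIBRATION = "calibration.txt"      # stand-in for the NEO Imatrix dataset

# Step 1: compute an importance matrix from the calibration text.
subprocess.run(
    ["llama-imatrix", "-m", MODEL_F16, "-f", CALIBRATION, "-o", IMATRIX],
    check=True,
)

# Step 2: quantize IQ1 .. Q6 with the imatrix, keeping the output
# tensor at BF16 (full precision), per the README notes.
for qtype in ["IQ1_M", "IQ2_M", "IQ3_M", "IQ4_XS", "Q4_K_M", "Q5_K_M", "Q6_K"]:
    subprocess.run(
        [
            "llama-quantize",
            "--imatrix", IMATRIX,
            "--output-tensor-type", "bf16",
            MODEL_F16,
            f"model-{qtype}.gguf",
            qtype,
        ],
        check=True,
    )

# Step 3: Q8_0 gets no imatrix (it has no effect at that size), so only
# the BF16 output-tensor override is applied.
subprocess.run(
    ["llama-quantize", "--output-tensor-type", "bf16",
     MODEL_F16, "model-Q8_0.gguf", "Q8_0"],
    check=True,
)
```

The `--output-tensor-type bf16` override is what keeps the output tensor at full precision while the remaining tensors drop to the target quant type.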