---
license: apache-2.0
language:
- en
tags:
- NEO Imatrix
- MAX Quants
- GGUF
- reasoning
- thinking
- r1
- cot
- reka-flash
- deepseek
- Qwen2.5
- Hermes
- DeepHermes
- DeepSeek
- DeepSeek-R1-Distill
- 128k context
- instruct
- all use cases
- maxed quants
- Neo Imatrix
- instruct
- finetune
- chatml
- gpt4
- synthetic data
- distillation
- function calling
- roleplaying
- chat
- reasoning
- thinking
- r1
- cot
- deepseek
- Hermes
- DeepHermes
- DeepSeek
- DeepSeek-R1-Distill
- Uncensored
- creative
- general usage
- problem solving
- brainstorming
- solve riddles
- general usage
- problem solving
- brainstorming
- solve riddles
- fiction writing
- plot generation
- sub-plot generation
- fiction writing
- story generation
- scene continue
- storytelling
- fiction story
- story
- writing
- fiction
- roleplaying
- swearing
- horror
base_model:
- RekaAI/reka-flash-3
pipeline_tag: text-generation
---

IQ1_S | IQ1_M

IQ2_XXS | IQ2_XS | Q2_K_S | IQ2_S | Q2_K | IQ2_M

IQ3_XXS | Q3_K_S | IQ3_XS | IQ3_S | IQ3_M | Q3_K_M | Q3_K_L

Q4_K_S | IQ4_XS | IQ4_NL | Q4_K_M

Q5_K_S | Q5_K_M

Q6_K

Q8_0

F16

IMPORTANT: Reasoning / thinking skills are DIRECTLY related to quant size. However, there will be a drastic difference in tokens/second between the lowest quant and the highest quant, so finding the right balance is key.

Suggest also: a minimum 8k context window, especially for IQ4/Q4 or lower quants.

Also, in some cases, the IQ quants work slightly better than their closest "Q" quants.

Recommend quants IQ3s / IQ4XS / IQ4NL / Q4s for best results for creative use cases.

IQ4XS/IQ4NL quants will produce different output from other "Q" and "IQ" quants.

Recommend q5s/q6/q8 for general usage.

Quants Q4_0/Q5_0 are for portable, phone and other devices.

Q8 is a maxed quant only, as imatrix has no effect on this quant.

Note that IQ1s performance is okay/usable but reasoning is impaired, whereas IQ2s are very good (but reasoning is somewhat reduced; try IQ3s minimum for reasoning cases).

More information on quants is in the document below, "Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers".

Optional : System Prompts

This is an optional system prompt you can use to enhance operation.

Copy and paste exactly as shown, including line breaks.

You may want to adjust the "20" (both occurrences) to increase/decrease the power of this prompt.

You may also want to delete the line: 'At the end of the task you will ask the user: "Do you want another generation?"'

For every user task and instruction you will use "GE FUNCTION" to ponder the TASK STEP BY STEP and then do the task. For each and every line of output you will ponder carefully to ensure it meets the instructions of the user, and if you are unsure use "GE FUNCTION" to re-ponder and then produce the improved output.

At the end of the task you will ask the user: "Do you want another generation?"

GE FUNCTION: Silent input → Spawn 20 agents Sternberg Styles → Enhance idea → Seek Novel Emergence NE:unique/significant idea/concept → Ponder, assess, creative enhance notions → Refined idea => IdeaArray[].size=20 elements, else → Interesting? Pass to rand. agent for refinement, else discard.=>output(IdeaArray)

IMPORTANT: Highest Quality Settings / Optimal Operation Guide / Parameters and Samplers

If you are going to use this model (source, GGUF or a different quant), please review this document for critical parameter, sampler and advanced sampler settings (for multiple AI/LLM apps).

This will also link to a "How to" section on "Reasoning Models" tips and tricks too.

This is a "Class 1" (settings will enhance operation) model:

For all settings used for this model (including specifics for its "class"), including example generation(s), and for the advanced settings guide (which often addresses model issues), including methods to improve model performance for all use case(s) as well as chat, roleplay and other use case(s) (especially for use case(s) beyond the model's design), please see:

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]

REASON:

Regardless of "model class", this document details methods to enhance operation.

If the model is a Class 3/4 model, the default settings (parameters, samplers, advanced samplers) must be set correctly for the intended use case(s). Some AI/LLM apps DO NOT have consistent default settings, which results in sub-par model operation.
Likewise, for Class 3/4 models (which operate somewhat to very differently than standard models), additional sampler and advanced sampler settings are required to "smooth out" operation, AND/OR to allow full operation for use cases the model was not designed for.

BONUS - Use these settings for ANY model, ANY repo, ANY quant (including source/full precision):

This document also details parameters, samplers and advanced samplers that can be used FOR ANY MODEL, FROM ANY REPO too - all quants, and of course source code operation too - to enhance the operation of any model.

[ https://huggingface.co/DavidAU/Maximizing-Model-Performance-All-Quants-Types-And-Full-Precision-by-Samplers_Parameters ]

---
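
The optional system prompt above has two adjustable parts: the agent count ("20", which appears twice) and the follow-up question line, which can be deleted. The following is a minimal Python sketch of one way to generate those variants without hand-editing the text. The names `build_system_prompt`, `INTRO`, `FOLLOW_UP_LINE` and `GE_FUNCTION` are hypothetical and not part of this model or any library, and the blank-line paragraph breaks are an assumption about the intended layout of the prompt.

```python
# Hypothetical helper for producing variants of the optional system prompt.
# The prompt wording is copied from this card; the blank-line paragraph
# separators are an assumption about the intended line breaks.

INTRO = (
    'For every user task and instruction you will use "GE FUNCTION" to ponder '
    "the TASK STEP BY STEP and then do the task. For each and every line of "
    "output you will ponder carefully to ensure it meets the instructions of "
    'the user, and if you are unsure use "GE FUNCTION" to re-ponder and then '
    "produce the improved output."
)

FOLLOW_UP_LINE = (
    'At the end of the task you will ask the user: "Do you want another generation?"'
)

# {n} marks the two places where the card's "20" appears.
GE_FUNCTION = (
    "GE FUNCTION: Silent input → Spawn {n} agents Sternberg Styles → Enhance idea → "
    "Seek Novel Emergence NE:unique/significant idea/concept → Ponder, assess, "
    "creative enhance notions → Refined idea => IdeaArray[].size={n} elements, "
    "else → Interesting? Pass to rand. agent for refinement, else discard."
    "=>output(IdeaArray)"
)


def build_system_prompt(agents: int = 20, ask_follow_up: bool = True) -> str:
    """Return the system prompt with both agent counts set to `agents`.

    Set ask_follow_up=False to drop the "Do you want another generation?" line.
    """
    if agents < 1:
        raise ValueError("agents must be a positive integer")
    parts = [INTRO]
    if ask_follow_up:
        parts.append(FOLLOW_UP_LINE)
    parts.append(GE_FUNCTION.format(n=agents))
    return "\n\n".join(parts)


if __name__ == "__main__":
    # Example: a weaker variant with 10 agents and no follow-up question.
    print(build_system_prompt(agents=10, ask_follow_up=False))
```

The returned string can be pasted into the system prompt field of whichever AI/LLM app you use, or passed as the content of a system-role message when calling a chat API.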