Text Generation
GGUF
English
mixture of experts
Mixture of Experts
8x3B
Llama 3.2 MOE
128k context
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
horror
mergekit
conversational
Update README.md
README.md
CHANGED
@@ -155,6 +155,11 @@ As you increase/decrease the number of experts, you may want to adjust temp, sam
 Your quant choice(s) too will impact instruction following and output generation; roughly, this means the model will understand
 more nuanced instructions and output stronger generation the higher you go up in quant(s).
 
+FLASH ATTENTION ENHANCEMENT:
+
+As per user feedback here [ https://huggingface.co/DavidAU/Llama-3.2-8X3B-MOE-Dark-Champion-Instruct-uncensored-abliterated-18.4B-GGUF/discussions/1 ]
+I would suggest trying this model with Flash Attention "on", depending on your use case.
+
 Quants, Samplers, Generational steering and other topics are covered in the section below: "Highest Quality Settings..."
 
 <B>Censored / Uncensored / Abliterated:</B>
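As a minimal sketch of the Flash Attention suggestion in the diff above, assuming you run the GGUF quant with llama.cpp (the quant filename, context size, and prompt here are hypothetical examples, not part of the model card):

```shell
# Enable Flash Attention via llama.cpp's --flash-attn flag
# (hypothetical quant filename; pick the quant you downloaded)
./llama-cli \
  -m Llama-3.2-8X3B-MOE-Dark-Champion-18.4B-Q4_K_M.gguf \
  --flash-attn \
  -c 8192 \
  -p "Write the opening scene of a horror story."
```

Other runtimes that wrap llama.cpp (e.g. LM Studio, KoboldCpp) expose the same toggle in their settings; as the author notes, whether it helps depends on your use case.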