Frowningface/Silly_Tavern_Presets_Database

Text Generation


Frowningface committed on Apr 18

Commit 4a53a54 (verified) · 1 Parent(s): 7609bea

Update README.md

Files changed (1)

README.md +1 -0

@@ -57,6 +57,7 @@ SillyTavern Master File → <a rel="nofollow" href="https://huggingface.co/Nitra
 Read more here: <a rel="nofollow" href="https://ai.google.dev/gemma/docs/core/prompt-structure">Gemma formatting and system instructions</a><br>
 — Flash Attention with KV Cache < FP16 causes a big slow down for Gemma 3 Models! <a rel="nofollow" href="https://github.com/LostRuins/koboldcpp/issues/1423">(Source)</a><br>
+— TLDR: Disable Flash Attention on Gemma 3 models
 1. Sindre's Gemma 3 Presets <a rel="nofollow" href="https://www.reddit.com/r/SillyTavernAI/comments/1jae28l/comment/mhl3ljy/">(Source)</a><br>