Update README.md
README.md
CHANGED
@@ -57,6 +57,7 @@ SillyTavern Master File → <a rel="nofollow" href="https://huggingface.co/Nitra
 Read more here: <a rel="nofollow" href="https://ai.google.dev/gemma/docs/core/prompt-structure">Gemma formatting and system instructions</a><br>
 
 — Flash Attention with KV Cache < FP16 causes a big slow down for Gemma 3 Models! <a rel="nofollow" href="https://github.com/LostRuins/koboldcpp/issues/1423">(Source)</a><br>
+— TLDR: Disable Flash Attention on Gemma 3 models
 
 
 1. Sindre's Gemma 3 Presets <a rel="nofollow" href="https://www.reddit.com/r/SillyTavernAI/comments/1jae28l/comment/mhl3ljy/">(Source)</a><br>