@@ -13,12 +13,81 @@ datasets:
 - anthracite-org/kalo-opus-instruct-3k-filtered-no-system
 - anthracite-org/nopm_claude_writing_fixed
 base_model: crestf411/Q2.5-32B-Slush
 ---
 # Triangle104/Q2.5-32B-Slush-Q3_K_S-GGUF
 This model was converted to GGUF format from [`crestf411/Q2.5-32B-Slush`](https://huggingface.co/crestf411/Q2.5-32B-Slush) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/crestf411/Q2.5-32B-Slush) for more details on the model.
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)
@@ -57,4 +126,4 @@ Step 3: Run inference through the main binary.
 or
 ```
 ./llama-server --hf-repo Triangle104/Q2.5-32B-Slush-Q3_K_S-GGUF --hf-file q2.5-32b-slush-q3_k_s.gguf -c 2048
-```

 - anthracite-org/kalo-opus-instruct-3k-filtered-no-system
 - anthracite-org/nopm_claude_writing_fixed
 base_model: crestf411/Q2.5-32B-Slush
+license: apache-2.0
 ---
 # Triangle104/Q2.5-32B-Slush-Q3_K_S-GGUF
 This model was converted to GGUF format from [`crestf411/Q2.5-32B-Slush`](https://huggingface.co/crestf411/Q2.5-32B-Slush) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
 Refer to the [original model card](https://huggingface.co/crestf411/Q2.5-32B-Slush) for more details on the model.
+---
+Model details:
+-
+Slush is a two-stage model trained with high LoRA dropout, where stage 1 is a pretraining continuation on the base model, aimed at boosting the model's creativity and writing capabilities. This is then merged into the instruction tune model, and stage 2 is a fine tuning step on top of this to further enhance its roleplaying capabilities and/or to repair any damage caused in the stage 1 merge.
+This is still at an early stage. As always, feedback is welcome, and begone if you demand perfection.
+The second stage, like the Sunfall series, follows the SillyTavern preset (ChatML), so YMMV, particularly if you use some other tool and/or preset.
+Parameter suggestions
+I did all my testing with temp 1, min-p 0.1, DRY 0.8, but enabled XTC as context grew and/or the model started saying "the same stuff".
+Qwen 2.5 32B Instruct (vanilla) has a strong tendency to start speaking for the user, especially in narrator scenarios. I was unable to train this out of the model completely, so you may want to add e.g. "\nYou" as a stopping string, and enable "trim incomplete sentences", especially if you have banned sentences.
+The model has a tendency to add an unnecessary final paragraph to responses during roleplay, sort of like a "summary" of how the character is feeling. Keeping it is OK, but it may snowball quickly. Hoping to address this but not sure how.
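If you drive the GGUF through `llama-server` rather than SillyTavern, the suggestions above could be expressed as a request body roughly like the one below. This is a sketch under assumptions, not the author's setup: it assumes the server's `/completion` endpoint and its `temperature`, `min_p`, `dry_multiplier`, `xtc_probability`, `xtc_threshold`, and `stop` fields; it assumes "DRY 0.8" refers to the DRY multiplier; and the XTC values are placeholders, since the card does not give any. `build_completion_payload` is a hypothetical helper name.

```python
import json


def build_completion_payload(prompt, enable_xtc=False):
    """Build a JSON-serializable body for llama-server's /completion endpoint."""
    payload = {
        "prompt": prompt,
        "temperature": 1.0,      # "temp 1" from the card
        "min_p": 0.1,            # "min-p 0.1" from the card
        "dry_multiplier": 0.8,   # assumption: "DRY 0.8" means the DRY multiplier
        "stop": ["\nYou"],       # stop before the model starts speaking for the user
    }
    if enable_xtc:
        # Placeholder values: the card only says XTC was enabled as context grew.
        payload["xtc_probability"] = 0.5
        payload["xtc_threshold"] = 0.1
    return payload


if __name__ == "__main__":
    # Print the body; POST it to http://<host>:<port>/completion with any HTTP client.
    print(json.dumps(build_completion_payload("Once upon a time"), indent=2))
```

"Trim incomplete sentences" is a client-side SillyTavern option, so it has no counterpart in this payload.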
+Training details
+    Stage 1 (continued pretraining)
+        Target: Qwen/Qwen2.5-32B (resulting LoRA merged into Qwen/Qwen2.5-32B-Instruct)
+        LoRA dropout 0.5 (motivation)
+        LoRA rank 32, alpha 64 (motivation)
+        LR cosine 4e-6
+        LoRA+ with LR Ratio: 15
+        Context size: 8192
+        Gradient accumulation steps: 4
+        Epochs: 1
+    Stage 2 (fine tune)
+        Target: Stage 1 model
+        LoRA dropout 0.5
+        LoRA rank 32, alpha 64
+        LR cosine 5e-6 (min 5e-7)
+        LoRA+ with LR Ratio: 15
+        Context size: 16384
+        Gradient accumulation steps: 4
+        Epochs: 1
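To make the hyperparameters above a little more concrete, here is a small arithmetic sketch. It assumes the usual conventions rather than anything confirmed by the card: the LoRA update is scaled by `alpha / rank`, the LoRA+ ratio multiplies the learning rate of the B matrices relative to the A matrices, and the cosine schedule decays from the peak to the minimum with no warmup. All function names are hypothetical.

```python
import math


def lora_scale(rank, alpha):
    """Scaling factor applied to the low-rank update (alpha / rank convention)."""
    return alpha / rank


def lora_plus_lr_b(base_lr, ratio):
    """Learning rate for the B matrices under LoRA+, given the A-matrix rate."""
    return base_lr * ratio


def cosine_lr(step, total_steps, max_lr, min_lr=0.0):
    """Cosine decay from max_lr at step 0 to min_lr at total_steps (no warmup)."""
    progress = step / total_steps
    return min_lr + 0.5 * (max_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("LoRA scale:", lora_scale(32, 64))
    print("Stage 1 B-matrix LR:", lora_plus_lr_b(4e-6, 15))
    print("Stage 2 B-matrix LR:", lora_plus_lr_b(5e-6, 15))
    # Stage 2 schedule at the start, midpoint, and end of a 100-step run
    # (the step count is illustrative only).
    for step in (0, 50, 100):
        print(step, cosine_lr(step, 100, 5e-6, 5e-7))
```

Under these assumptions the update scale is 2 for both stages, and the B matrices train at 15 times the listed learning rate.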
+Merge Details
+Merge Method
+This model was merged using the TIES merge method.
+Configuration
+The following YAML configuration was used to produce this model:
+models:
+  - model: stage1-model
+    parameters:
+      weight: 1
+      density: 1
+  - model: stage2-model
+    parameters:
+      weight: 1
+      density: 1
+  - model: Qwen/Qwen2.5-32B-Instruct
+    parameters:
+      weight: 0.9
+      density: 0.9
+merge_method: ties
+base_model: Qwen/Qwen2.5-32B
+parameters:
+  weight: 0.9
+  density: 0.9
+  normalize: true
+  int8_mask: true
+tokenizer_source: Qwen/Qwen2.5-32B-Instruct
+dtype: bfloat16
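For readers unfamiliar with TIES, the toy sketch below illustrates the general idea behind the `weight`, `density`, and `normalize` settings on flat lists of numbers. It is not the merge tool's actual implementation and makes simplifying assumptions: parameters are plain Python floats, trimming keeps the largest-magnitude deltas per model, and the sign for each parameter is elected from the sign of the weighted sum of deltas. `ties_merge` is a hypothetical name.

```python
def ties_merge(base, models, weights, densities, normalize=True):
    """Toy TIES merge over flat parameter lists.

    1. Compute each model's delta from the base (its "task vector").
    2. Trim: keep only the top `density` fraction of each delta by magnitude.
    3. Elect a sign per parameter from the weighted sum of trimmed deltas.
    4. Combine only the deltas that agree with the elected sign.
    """
    n = len(base)
    trimmed = []
    for params, density in zip(models, densities):
        delta = [p - b for p, b in zip(params, base)]
        keep = int(density * n)
        order = sorted(range(n), key=lambda i: abs(delta[i]), reverse=True)
        kept = set(order[:keep])
        trimmed.append([d if i in kept else 0.0 for i, d in enumerate(delta)])

    merged = []
    for i in range(n):
        weighted = [w * d[i] for w, d in zip(weights, trimmed)]
        sign = 1.0 if sum(weighted) >= 0 else -1.0
        # Keep only contributions whose sign matches the elected one.
        agree = [(wd, w) for wd, w in zip(weighted, weights) if wd * sign > 0]
        num = sum(wd for wd, _ in agree)
        den = sum(w for _, w in agree) if normalize else 1.0
        merged.append(base[i] + (num / den if agree else 0.0))
    return merged


if __name__ == "__main__":
    base = [0.0, 0.0, 0.0]
    model_a = [1.0, -1.0, 0.5]
    model_b = [1.0, 2.0, 0.25]
    print(ties_merge(base, [model_a, model_b], [1.0, 1.0], [1.0, 1.0]))
```

In the second parameter of the usage example the two models disagree in sign, so only the delta matching the elected sign contributes. With `density: 1`, as used for the stage 1 and stage 2 models in the configuration above, nothing is trimmed and only the sign election affects the result.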
+---
 ## Use with llama.cpp
 Install llama.cpp through brew (works on Mac and Linux)
 or
 ```
 ./llama-server --hf-repo Triangle104/Q2.5-32B-Slush-Q3_K_S-GGUF --hf-file q2.5-32b-slush-q3_k_s.gguf -c 2048
+```