---
base_model:
- google/gemma-3-27b-it
- google/gemma-3-27b-pt
- Columbidae/gemma-3-27b-half
library_name: transformers
tags:
- mergekit
- merge
---
# ✨G3 Glitter 12B✨
<figure>
<img src="https://huggingface.co/allura-org/Gemma-3-Glitter-12B/resolve/main/ComfyUI_02427_.png" width="600">
</figure>

A creative writing model based on Gemma 3 27B.

[Columbidae/gemma-3-27b-half](https://huggingface.co/Columbidae/gemma-3-27b-half), a 50/50 merge of 27B IT and 27B PT, was used as the base model. (This was done because of the success of [Starshine](https://huggingface.co/ToastyPigeon/Gemma-3-Starshine-12B), a 50/50 IT and PT merge.)
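For readers unfamiliar with mergekit, a 50/50 merge of this kind can be expressed with a config along these lines. This is an illustrative sketch only; the actual recipe used to produce `Columbidae/gemma-3-27b-half` is not documented here.

```yaml
# Hypothetical mergekit config for a 50/50 linear merge of IT and PT.
# Illustrative only -- not the published recipe for gemma-3-27b-half.
models:
  - model: google/gemma-3-27b-it
    parameters:
      weight: 0.5
  - model: google/gemma-3-27b-pt
    parameters:
      weight: 0.5
merge_method: linear
dtype: bfloat16
```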

The inclusion of the PT model does weaken instruction-following, but it also weakens the censorship/hesitancy to participate in certain fictional stories. The prose also becomes more natural with less of the IT model included.

**This model does better with short, to-the-point prompts; long, detailed system prompts will often confuse it.** (In testing, 1000-2000 token system prompts gave lackluster results compared to 100-500 token prompts.)

## Instruct Format

Uses Gemma 2/3 instruct and context formatting. Like Glitter 12B, this works well with `temp = 1, top-nsigma = 1.5`.
```
<start_of_turn>user
{User messages; can also put sysprompt here to use the built-in g3 training}<end_of_turn>
<start_of_turn>model
{model response}<end_of_turn>
```
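As a minimal sketch of assembling this turn format programmatically (the `format_gemma_prompt` helper below is hypothetical, not part of this repo or of transformers):

```python
def format_gemma_prompt(messages):
    """Build a Gemma 2/3-style chat prompt from (role, text) pairs.

    Roles are "user" or "model". A trailing "<start_of_turn>model\n"
    is appended so the model generates the next reply.
    """
    parts = []
    for role, text in messages:
        parts.append(f"<start_of_turn>{role}\n{text}<end_of_turn>\n")
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


# Example: a short, to-the-point prompt, as recommended above.
prompt = format_gemma_prompt([("user", "Write a short scene set on a moonlit pier.")])
print(prompt)
```

In practice, `tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)` from transformers produces this format from the model's bundled chat template.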