Nohobby committed 7546f08 (verified) · 1 parent: 47f2c59

Update README.md

Files changed (1): README.md (+157 −16)
README.md CHANGED
@@ -1,32 +1,173 @@
  ---
- base_model:
- - ReadyArt/Forgotten-Safeword-24B-V2.2
- - mergekit-community/MS3-RP-half1
- - mergekit-community/MS3-RP-RP-half2
  library_name: transformers
  tags:
  - mergekit
  - merge
-
  ---
- # merge

- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

  ## Merge Details
- ### Merge Method

- This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method using [ReadyArt/Forgotten-Safeword-24B-V2.2](https://huggingface.co/ReadyArt/Forgotten-Safeword-24B-V2.2) as a base.

- ### Models Merged

- The following models were included in the merge:
- * [mergekit-community/MS3-RP-half1](https://huggingface.co/mergekit-community/MS3-RP-half1)
- * [mergekit-community/MS3-RP-RP-half2](https://huggingface.co/mergekit-community/MS3-RP-RP-half2)

- ### Configuration

- The following YAML configuration was used to produce this model:

  ```yaml
  base_model: ReadyArt/Forgotten-Safeword-24B-V2.2
@@ -35,4 +176,4 @@ dtype: bfloat16
  models:
  - model: mergekit-community/MS3-RP-half1
  - model: mergekit-community/MS3-RP-RP-half2
- ```
 
  ---
+ language:
+ - en
+ license: apache-2.0
  library_name: transformers
  tags:
  - mergekit
  - merge
+ base_model:
+ - unsloth/Mistral-Small-24B-Base-2501
+ - unsloth/Mistral-Small-24B-Instruct-2501
+ - trashpanda-org/MS-24B-Instruct-Mullein-v0
+ - trashpanda-org/Llama3-24B-Mullein-v1
+ - ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
+ - TheDrummer/Cydonia-24B-v2
+ - estrogen/MS2501-24b-Ink-apollo-ep2
+ - huihui-ai/Mistral-Small-24B-Instruct-2501-abliterated
+ - ToastyPigeon/ms3-roselily-rp-v2
+ - PocketDoc/Dans-DangerousWinds-V1.1.1-24b
+ - ReadyArt/Forgotten-Safeword-24B-V2.2
  ---
+ ***
+
+ ### Overview
+
+ One of the merging steps for [Tantum](https://huggingface.co/Nohobby/MS3-Tantum-24B-v0.1). Might be better than the end result.
+
+ **Settings:**
+
+ Samplers: [Weird preset](https://files.catbox.moe/ccwmca.json) | [Forgotten-Safeword preset](https://huggingface.co/sleepdeprived3/Mistral-V7-Tekken-Extra-Dry)
+
+ Prompt format: Mistral-V7-Tekken (?)
+
+ I use [this](https://files.catbox.moe/daluze.json) lorebook for all chats instead of a system prompt for Mistral models.
+
+ ### Quants
+
+ [Static](https://huggingface.co/mradermacher/MS-RP-whole-GGUF) | [Imatrix](https://huggingface.co/mradermacher/MS-RP-whole-i1-GGUF)
+
+ ***

  ## Merge Details
+ ### Merging steps
+
+ ## MS3-test-Merge-1
+
+ ```yaml
+ models:
+   - model: unsloth/Mistral-Small-24B-Base-2501
+   - model: unsloth/Mistral-Small-24B-Instruct-2501+ToastyPigeon/new-ms-rp-test-ws
+     parameters:
+       select_topk:
+         - value: [0.05, 0.03, 0.02, 0.02, 0.01]
+   - model: unsloth/Mistral-Small-24B-Instruct-2501+estrogen/MS2501-24b-Ink-ep2-adpt
+     parameters:
+       select_topk: 0.1
+   - model: trashpanda-org/MS-24B-Instruct-Mullein-v0
+     parameters:
+       select_topk: 0.4
+ base_model: unsloth/Mistral-Small-24B-Base-2501
+ merge_method: sce
+ parameters:
+   int8_mask: true
+   rescale: true
+   normalize: true
+ dtype: bfloat16
+ tokenizer_source: base
+ ```
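This first step uses mergekit's `sce` method, where `select_topk` keeps only the fraction of delta elements on which the donor models disagree the most (by element-wise variance), as I understand SCE. A toy numpy sketch of that selection; `sce_like_merge` is a hypothetical helper, not mergekit's actual implementation:

```python
import numpy as np

def sce_like_merge(base: np.ndarray, models: list, topk: float) -> np.ndarray:
    # Deltas of each candidate model from the base weights.
    deltas = np.stack([m - base for m in models])
    # Per-element variance across models measures where they disagree.
    variance = deltas.var(axis=0)
    k = max(1, int(round(topk * variance.size)))
    threshold = np.sort(variance.ravel())[-k]   # k-th largest variance
    mask = variance >= threshold                # keep only high-variance elements
    fused = np.where(mask, deltas.mean(axis=0), 0.0)
    return base + fused

# Tiny demo: only the 20% of positions with the largest disagreement survive.
merged = sce_like_merge(np.zeros(10), [np.arange(10.0), np.ones(10)], topk=0.2)
```

In practice a config like the one above is run with mergekit's CLI, e.g. `mergekit-yaml config.yaml ./output-dir`.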
+
+ ```yaml
+ dtype: bfloat16
+ tokenizer_source: base
+ merge_method: della_linear
+ parameters:
+   density: 0.55
+ base_model: Step1
+ models:
+   - model: unsloth/Mistral-Small-24B-Instruct-2501
+     parameters:
+       weight:
+         - filter: v_proj
+           value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
+         - filter: o_proj
+           value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
+         - filter: up_proj
+           value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
+         - filter: gate_proj
+           value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
+         - filter: down_proj
+           value: [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
+         - value: 0
+   - model: Step1
+     parameters:
+       weight:
+         - filter: v_proj
+           value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
+         - filter: o_proj
+           value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
+         - filter: up_proj
+           value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
+         - filter: gate_proj
+           value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
+         - filter: down_proj
+           value: [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1]
+         - value: 1
+ ```
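The bracketed lists on each `filter` above are mergekit weight gradients: the listed values are spread across the layer stack, with intermediate layers interpolated linearly. A rough sketch of that expansion (assumed behavior; `expand_gradient` is a hypothetical name, not mergekit's code):

```python
def expand_gradient(values: list, num_layers: int) -> list:
    """Linearly interpolate a gradient list across num_layers layers."""
    if num_layers == 1 or len(values) == 1:
        return [float(values[0])] * num_layers
    out = []
    for layer in range(num_layers):
        # Position of this layer along the gradient list.
        pos = layer * (len(values) - 1) / (num_layers - 1)
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out

expand_gradient([0.0, 1.0], 3)  # e.g. [0.0, 0.5, 1.0]
```

So an 11-entry list like `[0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]` fades a tensor's contribution in and out over the depth of the network.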
+
+ Some early MS3 merge. Not really worth using on its own. Just added it for fun.
+
+ ## RP-half1
+
+ ```yaml
+ models:
+   - model: ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
+     parameters:
+       weight: 0.2
+       density: 0.7
+   - model: trashpanda-org/Llama3-24B-Mullein-v1
+     parameters:
+       weight: 0.2
+       density: 0.7
+   - model: TheDrummer/Cydonia-24B-v2
+     parameters:
+       weight: 0.2
+       density: 0.7
+ merge_method: della_linear
+ base_model: Nohobby/MS3-test-Merge-1
+ parameters:
+   epsilon: 0.2
+   lambda: 1.1
+ dtype: bfloat16
+ tokenizer:
+   source: base
+ ```
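The `epsilon` and `lambda` knobs here come from DELLA-style pruning: per my reading of the DELLA paper, delta entries are dropped with probabilities spread ±`epsilon` around `1 − density` (smaller magnitudes are more likely to be dropped), survivors are rescaled to preserve expectation, and `lambda` scales the merged delta. A toy sketch under those assumptions, not mergekit's actual code:

```python
import numpy as np

def della_drop(delta: np.ndarray, density: float, epsilon: float,
               lam: float, rng: np.random.Generator) -> np.ndarray:
    n = delta.size
    order = np.argsort(np.abs(delta).ravel())  # ascending magnitude
    p_drop = np.empty(n)
    # Smallest-magnitude entries get the highest drop probability.
    # (Assumes all drop probabilities stay below 1.)
    p_drop[order] = np.linspace(1 - density + epsilon, 1 - density - epsilon, n)
    keep = rng.random(n) >= p_drop
    # Rescale survivors by 1/(1 - p) so the expected delta is unchanged.
    kept = np.where(keep, delta.ravel() / (1 - p_drop), 0.0)
    return lam * kept.reshape(delta.shape)
```

With `density: 0.7` and `epsilon: 0.2` as above, drop probabilities would range from 0.1 for the largest deltas to 0.5 for the smallest.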
+
+ ## RP-half2
+
+ ```yaml
+ base_model: Nohobby/MS3-test-Merge-1
+ parameters:
+   epsilon: 0.05
+   lambda: 0.9
+   int8_mask: true
+   rescale: true
+   normalize: false
+ dtype: bfloat16
+ tokenizer:
+   source: base
+ merge_method: della
+ models:
+   - model: estrogen/MS2501-24b-Ink-apollo-ep2
+     parameters:
+       weight: [0.1, -0.01, 0.1, -0.02, 0.1]
+       density: [0.6, 0.4, 0.5, 0.4, 0.6]
+   - model: huihui-ai/Mistral-Small-24B-Instruct-2501-abliterated
+     parameters:
+       weight: [0.02, -0.01, 0.02, -0.02, 0.01]
+       density: [0.45, 0.55, 0.45, 0.55, 0.45]
+   - model: ToastyPigeon/ms3-roselily-rp-v2
+     parameters:
+       weight: [0.01, -0.02, 0.02, -0.025, 0.01]
+       density: [0.45, 0.65, 0.45, 0.65, 0.45]
+   - model: PocketDoc/Dans-DangerousWinds-V1.1.1-24b
+     parameters:
+       weight: [0.1, -0.01, 0.1, -0.02, 0.1]
+       density: [0.6, 0.4, 0.5, 0.4, 0.6]
+ ```
+
+ ## RP-broth
+
  ```yaml
  base_model: ReadyArt/Forgotten-Safeword-24B-V2.2
  models:
  - model: mergekit-community/MS3-RP-half1
  - model: mergekit-community/MS3-RP-RP-half2
+ ```
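Per the earlier card text, this final "RP-broth" step combines the two halves with the [Model Stock](https://arxiv.org/abs/2403.19522) method, which derives an interpolation ratio toward the averaged fine-tuned weights from the angle between their task vectors. A toy two-model sketch (hypothetical helpers, not mergekit's implementation):

```python
import numpy as np

def model_stock_ratio(k: int, cos_theta: float) -> float:
    # t = k*cos(theta) / ((k-1)*cos(theta) + 1); approaches 1 as models agree.
    return k * cos_theta / ((k - 1) * cos_theta + 1)

def model_stock_merge(base: np.ndarray, models: list) -> np.ndarray:
    # Task vectors (deltas from the base) for exactly two fine-tuned models.
    deltas = [m - base for m in models]
    cos = np.dot(deltas[0], deltas[1]) / (
        np.linalg.norm(deltas[0]) * np.linalg.norm(deltas[1]))
    t = model_stock_ratio(len(models), cos)
    return base + t * np.mean(deltas, axis=0)
```

The intuition: the more the two halves' deltas point the same way, the more of their average the merge keeps; orthogonal deltas are mostly discarded in favor of the base.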