Commit 8a1b83a (verified) · committed by lejelly · 1 Parent(s): 771a7fd

Upload folder using huggingface_hub
README.md ADDED
@@ -0,0 +1,61 @@
+ ---
+ tags:
+ - merge
+ - parameter_wise
+ - llm-adamerge
+ base_model: mistralai/Mistral-7B-v0.1
+ ---
+
+ # Merged Model using LLM-AdaMerge (parameter_wise)
+
+ This model was created by merging multiple fine-tuned models using the LLM-AdaMerge approach with parameter-wise merging.
+
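+ Concretely, under the task-arithmetic formulation that AdaMerging-style methods build on (stated here as background; see the LLM-AdaMerge paper for the exact objective), each merged parameter tensor is
+
+ $$\theta^{(p)}_{\mathrm{merged}} = \theta^{(p)}_{\mathrm{base}} + \sum_{k} \lambda^{(p)}_{k}\,\bigl(\theta^{(p)}_{k} - \theta^{(p)}_{\mathrm{base}}\bigr),$$
+
+ where $k$ ranges over the merged expert models and $\lambda^{(p)}_{k}$ is the coefficient learned for expert $k$ on parameter tensor $p$.
+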
+ ## Merge Details
+
+ - **Merge Type**: parameter_wise
+ - **Base Model**: mistralai/Mistral-7B-v0.1
+ - **Number of Models Merged**: 3
+ - **Models Merged**: instruct, math, code
+ - **Final Training Loss**: N/A
+ - **Training Epochs**: 0
+
+ ## Lambda Coefficients
+
+ ### Parameter-wise Lambdas
+
+ This model uses parameter-wise lambda coefficients learned during training, with an individual set of lambdas for each of the 291 parameter tensors (one coefficient per expert model per tensor).
+
+ See the uploaded `learned_lambdas.json` file for the detailed parameter-wise coefficients.
+
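+ For illustration, the snippet below shows how coefficients in this format could be applied to build the merged weights. It follows the structure of `learned_lambdas.json` (`param_names`, `model_names`, and `lambdas` as parallel per-tensor entries), but the function and its `base_sd`/`expert_sds` arguments are a hypothetical sketch, not the project's actual merging code:
+
+ ```python
+ import json
+
+ def merge_state_dicts(base_sd, expert_sds, lambdas_path="learned_lambdas.json"):
+     """Sketch of parameter-wise merging with learned lambdas.
+
+     base_sd: state_dict of the base model (mistralai/Mistral-7B-v0.1).
+     expert_sds: state_dicts of the expert models, ordered as in the
+     file's "model_names" (instruct, math, code). Both are assumed
+     inputs that the caller must load.
+     """
+     with open(lambdas_path) as f:
+         cfg = json.load(f)
+     merged = {}
+     for name, lam in zip(cfg["param_names"], cfg["lambdas"]):
+         # Task vector per expert: theta_k - theta_base, scaled by its lambda.
+         delta = sum(l * (sd[name] - base_sd[name])
+                     for l, sd in zip(lam, expert_sds))
+         merged[name] = base_sd[name] + delta
+     return merged
+ ```
+
+ Note that the coefficients are unconstrained (some are negative or exceed 1), so individual task vectors can be subtracted from the base model as well as added to it.
+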
+ ## Usage
+
+ ```python
+ import torch
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # Replace "your-username/model-name" with this repository's id on the Hub.
+ model = AutoModelForCausalLM.from_pretrained(
+     "your-username/model-name", torch_dtype=torch.float16
+ )
+ tokenizer = AutoTokenizer.from_pretrained("your-username/model-name")
+
+ # Use the model
+ inputs = tokenizer("Hello, how are you?", return_tensors="pt")
+ outputs = model.generate(**inputs, max_new_tokens=64)
+ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
+ ```
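+
+ Passing `torch_dtype=torch.float16` matches the `torch_dtype` recorded in this repository's `config.json`; on CPU-only machines it may be preferable to omit it and load in the float32 default.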
+
+ ## Training Configuration
+
+ See the uploaded `training_config.json` file for the detailed training configuration.
+
+ ## Citation
+
+ If you use this model, please cite the LLM-AdaMerge paper:
+
+ ```bibtex
+ @article{llmadamerge2024,
+   title={LLM-AdaMerge: Adaptive Model Merging for Large Language Models},
+   author={...},
+   year={2024}
+ }
+ ```
config.json ADDED
@@ -0,0 +1,26 @@
+ {
+   "architectures": [
+     "MistralForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "head_dim": null,
+   "hidden_act": "silu",
+   "hidden_size": 4096,
+   "initializer_range": 0.02,
+   "intermediate_size": 14336,
+   "max_position_embeddings": 32768,
+   "model_type": "mistral",
+   "num_attention_heads": 32,
+   "num_hidden_layers": 32,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 10000.0,
+   "sliding_window": 4096,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float16",
+   "transformers_version": "4.52.4",
+   "use_cache": true,
+   "vocab_size": 32000
+ }
generation_config.json ADDED
@@ -0,0 +1,6 @@
+ {
+   "_from_model_config": true,
+   "bos_token_id": 1,
+   "eos_token_id": 2,
+   "transformers_version": "4.52.4"
+ }
learned_lambdas.json ADDED
@@ -0,0 +1,1759 @@
1
+ {
2
+ "lambdas": [
3
+ [
4
+ -0.008600689470767975,
5
+ -0.27110379934310913,
6
+ 0.4651559889316559
7
+ ],
8
+ [
9
+ -0.03290576487779617,
10
+ 0.17893846333026886,
11
+ 0.8713870048522949
12
+ ],
13
+ [
14
+ 0.15764102339744568,
15
+ -0.07096076756715775,
16
+ 0.1824948787689209
17
+ ],
18
+ [
19
+ 0.535243809223175,
20
+ 0.8895148038864136,
21
+ 0.8248310685157776
22
+ ],
23
+ [
24
+ 0.32287347316741943,
25
+ -0.27368971705436707,
26
+ 0.4297662377357483
27
+ ],
28
+ [
29
+ -0.10271385312080383,
30
+ 0.1222417801618576,
31
+ -0.05851483345031738
32
+ ],
33
+ [
34
+ 0.5892490744590759,
35
+ 0.5596224665641785,
36
+ -0.09430505335330963
37
+ ],
38
+ [
39
+ 0.5037448406219482,
40
+ 0.5673904418945312,
41
+ 0.44907820224761963
42
+ ],
43
+ [
44
+ 0.04533325880765915,
45
+ -0.012348051182925701,
46
+ 0.46244335174560547
47
+ ],
48
+ [
49
+ 0.37038925290107727,
50
+ -0.4712038040161133,
51
+ 0.009666825644671917
52
+ ],
53
+ [
54
+ -0.3951989710330963,
55
+ 0.011475293897092342,
56
+ 0.7200853228569031
57
+ ],
58
+ [
59
+ 0.4634708762168884,
60
+ 0.03281472995877266,
61
+ 0.6417493224143982
62
+ ],
63
+ [
64
+ 0.15638893842697144,
65
+ 0.6444075703620911,
66
+ 0.33211857080459595
67
+ ],
68
+ [
69
+ 0.005255121272057295,
70
+ 0.6458375453948975,
71
+ 0.5695016980171204
72
+ ],
73
+ [
74
+ 0.5011530518531799,
75
+ -0.9443197250366211,
76
+ 0.2154204100370407
77
+ ],
78
+ [
79
+ 0.724826991558075,
80
+ -0.6742737293243408,
81
+ 0.7136117815971375
82
+ ],
83
+ [
84
+ 1.0802881717681885,
85
+ 0.3505321145057678,
86
+ 0.7226514220237732
87
+ ],
88
+ [
89
+ 0.44733840227127075,
90
+ 0.49678224325180054,
91
+ 0.14433912932872772
92
+ ],
93
+ [
94
+ -0.41781729459762573,
95
+ 0.07401419430971146,
96
+ 1.1384369134902954
97
+ ],
98
+ [
99
+ -0.0549161396920681,
100
+ 0.46900907158851624,
101
+ 0.10407879203557968
102
+ ],
103
+ [
104
+ 0.21801219880580902,
105
+ 0.17864975333213806,
106
+ 0.0976036787033081
107
+ ],
108
+ [
109
+ -0.3787577152252197,
110
+ 0.5172115564346313,
111
+ -0.10581710189580917
112
+ ],
113
+ [
114
+ 0.42402195930480957,
115
+ -0.6051362156867981,
116
+ 0.8535587787628174
117
+ ],
118
+ [
119
+ -0.0035712181124836206,
120
+ 0.2822836637496948,
121
+ -0.20307590067386627
122
+ ],
123
+ [
124
+ -0.3496381640434265,
125
+ -0.09288095682859421,
126
+ -0.007612093351781368
127
+ ],
128
+ [
129
+ 0.5201497077941895,
130
+ -0.51421058177948,
131
+ 0.01208819355815649
132
+ ],
133
+ [
134
+ 0.3452093005180359,
135
+ 0.452639639377594,
136
+ 0.3317156732082367
137
+ ],
138
+ [
139
+ -0.05814714357256889,
140
+ 0.28733497858047485,
141
+ 0.2111409306526184
142
+ ],
143
+ [
144
+ 0.1041894257068634,
145
+ 0.8851682543754578,
146
+ 0.7849133610725403
147
+ ],
148
+ [
149
+ 0.17876780033111572,
150
+ 1.0249152183532715,
151
+ 0.8022587299346924
152
+ ],
153
+ [
154
+ 0.2981276214122772,
155
+ 0.5915073156356812,
156
+ 0.7403908371925354
157
+ ],
158
+ [
159
+ -0.023820744827389717,
160
+ 0.5682498812675476,
161
+ 0.7562360167503357
162
+ ],
163
+ [
164
+ 0.18472808599472046,
165
+ -0.38978439569473267,
166
+ 0.3726218640804291
167
+ ],
168
+ [
169
+ -0.5412111878395081,
170
+ -0.8462530374526978,
171
+ 0.07724694162607193
172
+ ],
173
+ [
174
+ -0.002844978356733918,
175
+ 0.6006382703781128,
176
+ -0.13113689422607422
177
+ ],
178
+ [
179
+ 0.575594961643219,
180
+ 0.03936400264501572,
181
+ 0.5742295980453491
182
+ ],
183
+ [
184
+ -0.17656289041042328,
185
+ 0.8008044958114624,
186
+ 0.5196676850318909
187
+ ],
188
+ [
189
+ 0.392094224691391,
190
+ 0.04320698603987694,
191
+ 0.023837342858314514
192
+ ],
193
+ [
194
+ -0.39720162749290466,
195
+ -0.491166353225708,
196
+ -0.5983253717422485
197
+ ],
198
+ [
199
+ -0.11212798207998276,
200
+ 0.8369597792625427,
201
+ 0.09399152547121048
202
+ ],
203
+ [
204
+ -0.3240719437599182,
205
+ 0.6937462091445923,
206
+ 0.5937374830245972
207
+ ],
208
+ [
209
+ -0.072868712246418,
210
+ -0.4130878448486328,
211
+ 0.503585696220398
212
+ ],
213
+ [
214
+ 0.008192034438252449,
215
+ 1.0617002248764038,
216
+ 0.41240212321281433
217
+ ],
218
+ [
219
+ 0.10066752880811691,
220
+ -0.949410080909729,
221
+ 0.20881889760494232
222
+ ],
223
+ [
224
+ -0.13509775698184967,
225
+ 0.740580141544342,
226
+ -0.3912171721458435
227
+ ],
228
+ [
229
+ -0.6682857275009155,
230
+ -0.3668968975543976,
231
+ -0.4059883952140808
232
+ ],
233
+ [
234
+ -0.23979230225086212,
235
+ -0.04724369943141937,
236
+ -0.1683410406112671
237
+ ],
238
+ [
239
+ -0.030199818313121796,
240
+ -0.06801427155733109,
241
+ -0.2558162212371826
242
+ ],
243
+ [
244
+ 0.4532090127468109,
245
+ 0.51909339427948,
246
+ -0.06161358952522278
247
+ ],
248
+ [
249
+ 0.4220469892024994,
250
+ -0.30356091260910034,
251
+ 1.0478200912475586
252
+ ],
253
+ [
254
+ 0.3710097670555115,
255
+ 0.5032548308372498,
256
+ -0.3149319291114807
257
+ ],
258
+ [
259
+ -0.3288426399230957,
260
+ 0.7235692143440247,
261
+ 0.09411199390888214
262
+ ],
263
+ [
264
+ 0.6833317875862122,
265
+ 0.5491123795509338,
266
+ -0.13000386953353882
267
+ ],
268
+ [
269
+ -0.18602947890758514,
270
+ 0.30868157744407654,
271
+ 0.4305288791656494
272
+ ],
273
+ [
274
+ -0.010125085711479187,
275
+ 0.28111228346824646,
276
+ 0.2414601594209671
277
+ ],
278
+ [
279
+ -0.39355704188346863,
280
+ 0.17274145781993866,
281
+ 0.2901214361190796
282
+ ],
283
+ [
284
+ 0.4727241098880768,
285
+ 0.7610338926315308,
286
+ 0.38114458322525024
287
+ ],
288
+ [
289
+ -0.17738765478134155,
290
+ 0.8454529047012329,
291
+ 0.49150586128234863
292
+ ],
293
+ [
294
+ 0.18232625722885132,
295
+ 0.010407311841845512,
296
+ 0.8297613263130188
297
+ ],
298
+ [
299
+ 0.12667511403560638,
300
+ 0.06083226576447487,
301
+ -0.8528534173965454
302
+ ],
303
+ [
304
+ 0.36976495385169983,
305
+ 0.9643232226371765,
306
+ 0.9328480362892151
307
+ ],
308
+ [
309
+ -0.15158019959926605,
310
+ 0.7440004348754883,
311
+ 0.436826229095459
312
+ ],
313
+ [
314
+ -0.08947861939668655,
315
+ 0.3323182165622711,
316
+ 0.4373578131198883
317
+ ],
318
+ [
319
+ 0.16054478287696838,
320
+ 0.12135281413793564,
321
+ 0.09848786890506744
322
+ ],
323
+ [
324
+ 0.12028805911540985,
325
+ 0.35101237893104553,
326
+ 0.04929043725132942
327
+ ],
328
+ [
329
+ 1.0261940956115723,
330
+ 0.6754860281944275,
331
+ -0.07319711148738861
332
+ ],
333
+ [
334
+ -0.06602616608142853,
335
+ 0.4309370219707489,
336
+ -0.25024744868278503
337
+ ],
338
+ [
339
+ 0.28278306126594543,
340
+ 0.7632055878639221,
341
+ -0.4026739001274109
342
+ ],
343
+ [
344
+ 0.20000706613063812,
345
+ -0.05334228649735451,
346
+ 0.683163583278656
347
+ ],
348
+ [
349
+ -0.2022043764591217,
350
+ 0.3802075982093811,
351
+ 0.9503597021102905
352
+ ],
353
+ [
354
+ -0.36965954303741455,
355
+ 0.28152400255203247,
356
+ -0.019405817613005638
357
+ ],
358
+ [
359
+ 0.0975869670510292,
360
+ 0.29181671142578125,
361
+ -0.12390828877687454
362
+ ],
363
+ [
364
+ -0.211451455950737,
365
+ 0.004585533868521452,
366
+ -0.029675688594579697
367
+ ],
368
+ [
369
+ -0.12587527930736542,
370
+ 0.4573831260204315,
371
+ 0.22337642312049866
372
+ ],
373
+ [
374
+ -0.04365545138716698,
375
+ 0.69129478931427,
376
+ 0.5938308238983154
377
+ ],
378
+ [
379
+ 0.09625856578350067,
380
+ -0.5298608541488647,
381
+ -0.46198388934135437
382
+ ],
383
+ [
384
+ 0.9985604882240295,
385
+ 0.5789576172828674,
386
+ -0.38657164573669434
387
+ ],
388
+ [
389
+ 0.4649677574634552,
390
+ 0.600328803062439,
391
+ 0.3077949583530426
392
+ ],
393
+ [
394
+ -0.0432770811021328,
395
+ 0.5958781242370605,
396
+ 0.5499005913734436
397
+ ],
398
+ [
399
+ 0.8883084058761597,
400
+ 1.05435049533844,
401
+ 0.6483041048049927
402
+ ],
403
+ [
404
+ 0.3577224314212799,
405
+ 0.29181671142578125,
406
+ 0.31332382559776306
407
+ ],
408
+ [
409
+ 0.3399612605571747,
410
+ 0.6493778228759766,
411
+ -0.11633078008890152
412
+ ],
413
+ [
414
+ -0.47325441241264343,
415
+ 0.23329900205135345,
416
+ 0.16339211165905
417
+ ],
418
+ [
419
+ 0.1438535898923874,
420
+ 0.22306153178215027,
421
+ 0.20849788188934326
422
+ ],
423
+ [
424
+ 0.14767497777938843,
425
+ -0.06503646820783615,
426
+ -0.23767957091331482
427
+ ],
428
+ [
429
+ 0.35725727677345276,
430
+ -0.10453184694051743,
431
+ 0.5414588451385498
432
+ ],
433
+ [
434
+ -0.09145285934209824,
435
+ 0.3344604969024658,
436
+ 0.04309385269880295
437
+ ],
438
+ [
439
+ 1.37043035030365,
440
+ -0.7319602370262146,
441
+ 0.42546531558036804
442
+ ],
443
+ [
444
+ -0.08218256384134293,
445
+ 0.9168795943260193,
446
+ -0.34223073720932007
447
+ ],
448
+ [
449
+ 0.15490470826625824,
450
+ 0.5449513792991638,
451
+ 0.010205847211182117
452
+ ],
453
+ [
454
+ 0.033627722412347794,
455
+ 0.7240769863128662,
456
+ 0.21537230908870697
457
+ ],
458
+ [
459
+ -0.13714678585529327,
460
+ -0.11730996519327164,
461
+ -0.053494907915592194
462
+ ],
463
+ [
464
+ 0.8719881176948547,
465
+ 0.9111194610595703,
466
+ 0.49565842747688293
467
+ ],
468
+ [
469
+ 0.5770467519760132,
470
+ 0.28903597593307495,
471
+ 0.9253307580947876
472
+ ],
473
+ [
474
+ -0.7643914222717285,
475
+ -0.3916642367839813,
476
+ 0.2070000171661377
477
+ ],
478
+ [
479
+ 0.7137627601623535,
480
+ -0.3188117742538452,
481
+ -0.7938769459724426
482
+ ],
483
+ [
484
+ 0.2149561196565628,
485
+ -0.8529192805290222,
486
+ -0.840792715549469
487
+ ],
488
+ [
489
+ 0.18863415718078613,
490
+ -0.07081560045480728,
491
+ 0.10497643798589706
492
+ ],
493
+ [
494
+ 0.7639882564544678,
495
+ 0.29181671142578125,
496
+ -0.31540560722351074
497
+ ],
498
+ [
499
+ 0.7296006083488464,
500
+ 0.7830615639686584,
501
+ -0.02962500974535942
502
+ ],
503
+ [
504
+ -0.1286885142326355,
505
+ -0.4659258723258972,
506
+ 0.12040185928344727
507
+ ],
508
+ [
509
+ -0.20754382014274597,
510
+ -0.38997575640678406,
511
+ -0.3076478838920593
512
+ ],
513
+ [
514
+ 0.11249106377363205,
515
+ -0.17241854965686798,
516
+ 0.4551856815814972
517
+ ],
518
+ [
519
+ -0.22096848487854004,
520
+ -0.30131396651268005,
521
+ 0.3322945833206177
522
+ ],
523
+ [
524
+ -0.2885717451572418,
525
+ 0.13564074039459229,
526
+ 0.3038218021392822
527
+ ],
528
+ [
529
+ 0.5470362901687622,
530
+ -0.056911759078502655,
531
+ 0.3336308002471924
532
+ ],
533
+ [
534
+ 0.14821100234985352,
535
+ 1.019626259803772,
536
+ 0.11912097781896591
537
+ ],
538
+ [
539
+ -0.26448413729667664,
540
+ 0.4463828206062317,
541
+ 0.5633707642555237
542
+ ],
543
+ [
544
+ -0.5019347667694092,
545
+ -0.8952313661575317,
546
+ 0.9968082308769226
547
+ ],
548
+ [
549
+ -0.09097509831190109,
550
+ -0.5021821856498718,
551
+ -0.774272084236145
552
+ ],
553
+ [
554
+ 0.7782373428344727,
555
+ -0.8219204545021057,
556
+ -0.6385967135429382
557
+ ],
558
+ [
559
+ -0.4179416298866272,
560
+ 1.0138181447982788,
561
+ 0.9003906846046448
562
+ ],
563
+ [
564
+ -0.7695522904396057,
565
+ 0.4951356053352356,
566
+ 0.27838262915611267
567
+ ],
568
+ [
569
+ 0.1352238953113556,
570
+ -0.09653834253549576,
571
+ -0.5367135405540466
572
+ ],
573
+ [
574
+ 0.4316764771938324,
575
+ -0.15874578058719635,
576
+ -0.5840363502502441
577
+ ],
578
+ [
579
+ -0.8944721221923828,
580
+ -0.6129788756370544,
581
+ 0.398253858089447
582
+ ],
583
+ [
584
+ 0.6038775444030762,
585
+ 1.230621576309204,
586
+ 0.23074018955230713
587
+ ],
588
+ [
589
+ -0.3480087220668793,
590
+ -0.41038715839385986,
591
+ -0.33286935091018677
592
+ ],
593
+ [
594
+ 0.16971904039382935,
595
+ -0.7054998874664307,
596
+ -0.17762042582035065
597
+ ],
598
+ [
599
+ 0.10676750540733337,
600
+ 0.33191439509391785,
601
+ 0.28654158115386963
602
+ ],
603
+ [
604
+ 0.8161095976829529,
605
+ -0.7738041281700134,
606
+ 0.4539409875869751
607
+ ],
608
+ [
609
+ 0.4919089376926422,
610
+ 0.19382242858409882,
611
+ 0.13490013778209686
612
+ ],
613
+ [
614
+ -0.5341103672981262,
615
+ -0.4275410771369934,
616
+ 0.16920465230941772
617
+ ],
618
+ [
619
+ -0.5047714114189148,
620
+ 0.6682338118553162,
621
+ 0.9830099940299988
622
+ ],
623
+ [
624
+ 0.5792328715324402,
625
+ 0.020967308431863785,
626
+ 0.16329899430274963
627
+ ],
628
+ [
629
+ -0.028551798313856125,
630
+ 0.29181671142578125,
631
+ 0.6271538138389587
632
+ ],
633
+ [
634
+ 1.0009040832519531,
635
+ 1.0899860858917236,
636
+ 0.03138141706585884
637
+ ],
638
+ [
639
+ 0.1733202189207077,
640
+ 0.2952180504798889,
641
+ 0.39154383540153503
642
+ ],
643
+ [
644
+ 1.1995248794555664,
645
+ 0.49439093470573425,
646
+ 0.49956780672073364
647
+ ],
648
+ [
649
+ 0.3402847945690155,
650
+ -0.7532049417495728,
651
+ 0.3707257807254791
652
+ ],
653
+ [
654
+ 0.16844721138477325,
655
+ 0.0371837355196476,
656
+ 0.21008865535259247
657
+ ],
658
+ [
659
+ -0.18475691974163055,
660
+ 0.492800772190094,
661
+ 1.1090965270996094
662
+ ],
663
+ [
664
+ 1.37343430519104,
665
+ 0.6074701547622681,
666
+ 0.07295160740613937
667
+ ],
668
+ [
669
+ -0.2716894745826721,
670
+ 0.34719541668891907,
671
+ 0.2974252998828888
672
+ ],
673
+ [
674
+ 0.6517245173454285,
675
+ 0.34517648816108704,
676
+ -0.6991267800331116
677
+ ],
678
+ [
679
+ 1.131462574005127,
680
+ -0.4181934893131256,
681
+ 0.014711310155689716
682
+ ],
683
+ [
684
+ 0.45891329646110535,
685
+ 0.6425840258598328,
686
+ 0.5629575848579407
687
+ ],
688
+ [
689
+ 0.9829176068305969,
690
+ 0.6837454438209534,
691
+ 0.5779233574867249
692
+ ],
693
+ [
694
+ 0.21745261549949646,
695
+ 0.10289150476455688,
696
+ 1.0186656713485718
697
+ ],
698
+ [
699
+ -0.7066150903701782,
700
+ 0.2927456200122833,
701
+ 0.05835003778338432
702
+ ],
703
+ [
704
+ -0.29833900928497314,
705
+ -0.24320939183235168,
706
+ 0.33919867873191833
707
+ ],
708
+ [
709
+ 0.42278122901916504,
710
+ -0.6517714858055115,
711
+ -0.45446234941482544
712
+ ],
713
+ [
714
+ 0.1102355569601059,
715
+ 0.33386605978012085,
716
+ -0.047687217593193054
717
+ ],
718
+ [
719
+ 1.0147796869277954,
720
+ 0.29181671142578125,
721
+ -0.4382956326007843
722
+ ],
723
+ [
724
+ 0.8252743482589722,
725
+ 0.2133755087852478,
726
+ 0.08934953808784485
727
+ ],
728
+ [
729
+ 0.04747254401445389,
730
+ 0.3921492397785187,
731
+ 0.3003828227519989
732
+ ],
733
+ [
734
+ 0.8993004560470581,
735
+ 0.732109785079956,
736
+ 0.5620099902153015
737
+ ],
738
+ [
739
+ 0.969788670539856,
740
+ 0.3594958782196045,
741
+ 0.24025796353816986
742
+ ],
743
+ [
744
+ -0.5418591499328613,
745
+ 0.684735119342804,
746
+ 0.13651178777217865
747
+ ],
748
+ [
749
+ -0.3063417077064514,
750
+ -0.7611579895019531,
751
+ 0.06944245845079422
752
+ ],
753
+ [
754
+ -0.899300754070282,
755
+ 0.0799374058842659,
756
+ -0.4000241458415985
757
+ ],
758
+ [
759
+ 1.050485372543335,
760
+ 1.190054178237915,
761
+ 1.5068362951278687
762
+ ],
763
+ [
764
+ 1.248936653137207,
765
+ 0.29181671142578125,
766
+ -0.10971122980117798
767
+ ],
768
+ [
769
+ -0.13933615386486053,
770
+ -0.24252121150493622,
771
+ 0.36267510056495667
772
+ ],
773
+ [
774
+ -0.36821112036705017,
775
+ 0.19332224130630493,
776
+ 0.5897371768951416
777
+ ],
778
+ [
779
+ 1.270178198814392,
780
+ 1.128736138343811,
781
+ 0.35461029410362244
782
+ ],
783
+ [
784
+ 0.8342562317848206,
785
+ 0.7590351700782776,
786
+ 0.16291548311710358
787
+ ],
788
+ [
789
+ -0.21916896104812622,
790
+ 0.5995560884475708,
791
+ 0.9344035387039185
792
+ ],
793
+ [
794
+ -0.4020785391330719,
795
+ 0.13320356607437134,
796
+ -0.5687698125839233
797
+ ],
798
+ [
799
+ -0.3448737859725952,
800
+ 0.7098224759101868,
801
+ 0.4886254370212555
802
+ ],
803
+ [
804
+ 0.7236321568489075,
805
+ 0.49880942702293396,
806
+ 0.8195453882217407
807
+ ],
808
+ [
809
+ 1.111620545387268,
810
+ 0.29181671142578125,
811
+ -0.4977026879787445
812
+ ],
813
+ [
814
+ 1.1955883502960205,
815
+ 0.09661023318767548,
816
+ 0.27732595801353455
817
+ ],
818
+ [
819
+ 0.47855231165885925,
820
+ 0.3998045027256012,
821
+ 0.520326554775238
822
+ ],
823
+ [
824
+ 0.7978026866912842,
825
+ 1.0429245233535767,
826
+ 1.114219069480896
827
+ ],
828
+ [
829
+ 0.36164867877960205,
830
+ 0.07844629883766174,
831
+ 0.984810471534729
832
+ ],
833
+ [
834
+ -0.1214253231883049,
835
+ 0.017159847542643547,
836
+ 1.3403764963150024
837
+ ],
838
+ [
839
+ 0.27760541439056396,
840
+ -0.9298321008682251,
841
+ 0.4096109867095947
842
+ ],
843
+ [
844
+ -0.004906071349978447,
845
+ 0.2557479739189148,
846
+ -0.4466286301612854
847
+ ],
848
+ [
849
+ -0.3781079053878784,
850
+ -0.9627765417098999,
851
+ 1.4935261011123657
852
+ ],
853
+ [
854
+ 1.2579375505447388,
855
+ 0.29181671142578125,
856
+ -0.3692721724510193
857
+ ],
858
+ [
859
+ 0.7241488695144653,
860
+ -0.08786455541849136,
861
+ 1.2958627939224243
862
+ ],
863
+ [
864
+ 0.7014161348342896,
865
+ -0.09325847774744034,
866
+ -0.25087472796440125
867
+ ],
868
+ [
869
+ 0.4672619700431824,
870
+ 0.010848304256796837,
871
+ -0.12294646352529526
872
+ ],
873
+ [
874
+ 0.005329097621142864,
875
+ 1.0686196088790894,
876
+ -0.4747982323169708
877
+ ],
878
+ [
879
+ 0.6448685526847839,
880
+ 1.1435459852218628,
881
+ -0.4118386507034302
882
+ ],
883
+ [
884
+ 0.06303134560585022,
885
+ 0.7841649055480957,
886
+ 0.20986035466194153
887
+ ],
888
+ [
889
+ 0.5187354683876038,
890
+ -0.7191954851150513,
891
+ -0.2346244901418686
892
+ ],
893
+ [
894
+ 0.5911051630973816,
895
+ -0.5409417748451233,
896
+ -0.9105331301689148
897
+ ],
898
+ [
899
+ -0.030187280848622322,
900
+ 0.29181671142578125,
901
+ 1.038795828819275
902
+ ],
903
+ [
904
+ 0.6818335056304932,
905
+ 0.6027613282203674,
906
+ 1.3147859573364258
907
+ ],
908
+ [
909
+ 0.37735089659690857,
910
+ -0.15784670412540436,
911
+ 0.3256359100341797
912
+ ],
913
+ [
914
+ 1.1971839666366577,
915
+ 0.8749886155128479,
916
+ 0.6059221029281616
917
+ ],
918
+ [
919
+ 0.6457439064979553,
920
+ 0.6699243187904358,
921
+ 0.4733956754207611
922
+ ],
923
+ [
924
+ -0.26511242985725403,
925
+ 0.6644261479377747,
926
+ -0.47634589672088623
927
+ ],
928
+ [
929
+ 0.616598904132843,
930
+ 0.16727499663829803,
931
+ 0.7800426483154297
932
+ ],
933
+ [
934
+ 0.5040062069892883,
935
+ -0.14378352463245392,
936
+ -0.13610804080963135
937
+ ],
938
+ [
939
+ 1.2880492210388184,
940
+ 0.36526361107826233,
941
+ -0.38332849740982056
942
+ ],
943
+ [
944
+ 0.7372151613235474,
945
+ 0.29181671142578125,
946
+ -0.03127037733793259
947
+ ],
948
+ [
949
+ -0.17959535121917725,
950
+ -0.46408042311668396,
951
+ 0.40369734168052673
952
+ ],
953
+ [
954
+ 1.0773673057556152,
955
+ 0.18843935430049896,
956
+ 0.31045636534690857
957
+ ],
958
+ [
959
+ -0.06575462967157364,
960
+ -0.0999893993139267,
961
+ -0.3071534037590027
962
+ ],
963
+ [
964
+ 0.9312586784362793,
965
+ 0.037244897335767746,
966
+ 1.262158751487732
967
+ ],
968
+ [
969
+ 0.5934927463531494,
970
+ -0.11001330614089966,
971
+ 0.4146396219730377
972
+ ],
973
+ [
974
+ 0.8895318508148193,
975
+ 0.33912143111228943,
976
+ 0.441883385181427
977
+ ],
978
+ [
979
+ -0.11742016673088074,
980
+ 0.6659654378890991,
981
+ 1.0389689207077026
982
+ ],
983
+ [
984
+ -0.9391144514083862,
985
+ 1.1299172639846802,
986
+ 0.2904535233974457
987
+ ],
988
+ [
989
+ -0.19635428488254547,
990
+ 0.29181671142578125,
991
+ 1.180428385734558
992
+ ],
993
+ [
994
+ 0.3635255694389343,
995
+ -0.7368738651275635,
996
+ -0.44110631942749023
997
+ ],
998
+ [
999
+ 0.6002898216247559,
1000
+ 0.7115048766136169,
1001
+ 0.4520062804222107
1002
+ ],
1003
+ [
1004
+ 0.5340331792831421,
1005
+ 0.5105125308036804,
1006
+ 0.6853306293487549
1007
+ ],
1008
+ [
1009
+ -0.35422268509864807,
1010
+ -0.383005291223526,
1011
+ -0.494064599275589
1012
+ ],
1013
+ [
1014
+ 1.4476197957992554,
1015
+ -0.1437377631664276,
1016
+ 0.6719149351119995
1017
+ ],
1018
+ [
1019
+ -0.46162307262420654,
1020
+ 0.4841626286506653,
1021
+ -0.11894477158784866
1022
+ ],
1023
+ [
1024
+ 1.2744969129562378,
1025
+ -0.06925814598798752,
1026
+ -0.4621801972389221
1027
+ ],
1028
+ [
1029
+ 0.08298911154270172,
1030
+ -0.1357104778289795,
1031
+ 0.3881130814552307
1032
+ ],
1033
+ [
1034
+ 0.5402959585189819,
1035
+ 0.2570511996746063,
1036
+ -0.08939802646636963
1037
+ ],
1038
+ [
1039
+ -0.6459614038467407,
1040
+ -0.25679051876068115,
1041
+ 0.8747705817222595
1042
+ ],
1043
+ [
1044
+ 0.43076369166374207,
1045
+ -0.5021677613258362,
1046
+ 0.0363541804254055
1047
+ ],
1048
+ [
1049
+ 0.31835153698921204,
1050
+ 0.5368619561195374,
1051
+ 0.7120820879936218
1052
+ ],
1053
+ [
1054
+ 1.3529167175292969,
1055
+ 0.2760588228702545,
1056
+ 0.45914947986602783
1057
+ ],
1058
+ [
1059
+ 1.0444769859313965,
1060
+ 0.8666611909866333,
1061
+ 1.2023478746414185
1062
+ ],
1063
+ [
1064
+ -0.26926925778388977,
1065
+ -0.8990693092346191,
1066
+ -0.19023294746875763
1067
+ ],
1068
+ [
1069
+ -0.1632995754480362,
1070
+ -0.41065558791160583,
1071
+ -0.23155640065670013
1072
+ ],
1073
+ [
1074
+ 0.5899744629859924,
1075
+ -0.6720564365386963,
1076
+ 0.04501838609576225
1077
+ ],
1078
+ [
1079
+ 0.347116082906723,
1080
+ 0.29181671142578125,
1081
+ 0.6329561471939087
1082
+ ],
1083
+ [
1084
+ -0.7669775485992432,
1085
+ 0.48225492238998413,
1086
+ 0.566737174987793
1087
+ ],
1088
+ [
1089
+ -0.1526595950126648,
1090
+ -0.08648448437452316,
1091
+ 0.7179281711578369
1092
+ ],
1093
+ [
1094
+ 0.8701344132423401,
1095
+ -0.12348105758428574,
1096
+ -0.07938604801893234
1097
+ ],
1098
+ [
1099
+ 0.6966987252235413,
1100
+ 0.9275979995727539,
1101
+ 0.018549533560872078
1102
+ ],
1103
+ [
1104
+ 1.3724595308303833,
1105
+ -0.5801642537117004,
1106
+ 0.22728000581264496
1107
+ ],
1108
+ [
1109
+ -0.28334110975265503,
1110
+ -0.7934252619743347,
1111
+ 0.10227243602275848
1112
+ ],
1113
+ [
1114
+ 0.40091943740844727,
1115
+ 0.29383963346481323,
1116
+ 1.2603650093078613
1117
+ ],
1118
+ [
1119
+ 0.2890244126319885,
1120
+ 1.226073980331421,
1121
+ -0.7119418978691101
1122
+ ],
1123
+ [
1124
+ 0.6233953237533569,
1125
+ 1.0033429861068726,
1126
+ 1.032111644744873
1127
+ ],
1128
+ [
1129
+ 0.49351900815963745,
1130
+ 0.8799840211868286,
1131
+ 0.7926546931266785
1132
+ ],
1133
+ [
1134
+ 0.34952425956726074,
1135
+ 0.1354638785123825,
1136
+ -0.4291952848434448
1137
+ ],
1138
+ [
1139
+ 0.2993246018886566,
1140
+ 0.15255890786647797,
1141
+ 0.37319880723953247
1142
+ ],
1143
+ [
1144
+ 0.9639967083930969,
1145
+ 0.1397349089384079,
1146
+ -0.41835474967956543
1147
+ ],
1148
+ [
1149
+ 0.5237535238265991,
1150
+ -0.896891713142395,
1151
+ -1.351672649383545
1152
+ ],
1153
+ [
1154
+ -0.3893134593963623,
1155
+ 0.31328412890434265,
1156
+ -0.08003895729780197
1157
+ ],
1158
+ [
1159
+ 0.8522402048110962,
1160
+ 0.5729881525039673,
1161
+ 0.18530771136283875
1162
+ ],
1163
+ [
1164
+ 1.7133067846298218,
1165
+ 0.47896021604537964,
1166
+ 0.16805309057235718
1167
+ ],
1168
+ [
1169
+ 0.4199868142604828,
1170
+ 0.7454566359519958,
1171
+ 0.15968471765518188
1172
+ ],
1173
+ [
1174
+ -0.5626803040504456,
1175
+ -0.32539764046669006,
1176
+ 1.6168047189712524
1177
+ ],
1178
+ [
1179
+ 0.4851708710193634,
1180
+ 0.08485821634531021,
1181
+ 0.552808403968811
1182
+ ],
1183
+ [
1184
+ 0.3033086359500885,
1185
+ 0.3137320876121521,
1186
+ 0.23798324167728424
1187
+ ],
1188
+ [
1189
+ 0.5072812438011169,
1190
+ -0.7742474675178528,
1191
+ 0.20723572373390198
1192
+ ],
1193
+ [
1194
+ 1.5272622108459473,
1195
+ -0.857026994228363,
1196
+ -0.3885817527770996
1197
+ ],
1198
+ [
1199
+ -0.8498314619064331,
1200
+ -1.0903066396713257,
1201
+ 1.107202410697937
1202
+ ],
1203
+ [
1204
+ -0.9824524521827698,
1205
+ -0.5545493364334106,
1206
+ 0.6659228801727295
1207
+ ],
1208
+ [
1209
+ 1.5147898197174072,
1210
+ 0.945969820022583,
1211
+ 0.7589691281318665
1212
+ ],
1213
+ [
1214
+ 0.6783913969993591,
1215
+ 0.29181671142578125,
1216
+ 0.5737981200218201
1217
+ ],
1218
+ [
1219
+ 0.2241208702325821,
1220
+ 0.13132338225841522,
1221
+ 1.100775122642517
1222
+ ],
1223
+ [
1224
+ 0.26957446336746216,
1225
+ -0.4629915654659271,
1226
+ -0.8701563477516174
1227
+ ],
1228
+ [
1229
+ -0.49690186977386475,
1230
+ 0.5250351428985596,
1231
+ -0.4097417891025543
1232
+ ],
1233
+ [
1234
+ 0.5711739659309387,
1235
+ -0.8684530854225159,
1236
+ 0.45744916796684265
1237
+ ],
1238
+ [
1239
+ 1.0462424755096436,
1240
+ -0.7023312449455261,
1241
+ -0.19748517870903015
1242
+ ],
1243
+ [
1244
+ 0.030918758362531662,
1245
+ 0.7856431603431702,
1246
+ -0.3355279564857483
1247
+ ],
1248
+ [
1249
+ -0.008804546669125557,
1250
+ -0.8827769756317139,
1251
+ -0.4400416314601898
1252
+ ],
1253
+ [
1254
+ 1.140304446220398,
1255
+ 0.5504470467567444,
1256
+ 1.6465587615966797
1257
+ ],
1258
+ [
1259
+ -0.3450998365879059,
1260
+ 0.29181671142578125,
1261
+ 0.30365073680877686
1262
+ ],
1263
+ [
1264
+ 0.9964448809623718,
1265
+ -0.62396639585495,
1266
+ 0.5908036828041077
1267
+ ],
1268
+ [
1269
+ 0.14356864988803864,
1270
+ -0.23120446503162384,
1271
+ 0.48505157232284546
1272
+ ],
1273
+ [
1274
+ 0.07708197087049484,
1275
+ -0.05111119896173477,
1276
+ -0.20051567256450653
1277
+ ],
1278
+ [
1279
+ 1.3595986366271973,
1280
+ -1.1465976238250732,
1281
+ -0.2699911892414093
1282
+ ],
1283
+ [
1284
+ 0.4273603558540344,
1285
+ -0.44986778497695923,
1286
+ -0.4925510883331299
1287
+ ],
1288
+ [
1289
+ 1.0969822406768799,
1290
+ -1.0698773860931396,
1291
+ -0.3923966884613037
1292
+ ],
1293
+ [
1294
+ -0.8404163122177124,
1295
+ 1.4395596981048584,
1296
+ -0.19648201763629913
1297
+ ],
1298
+ [
1299
+ 1.3057408332824707,
1300
+ 0.27632415294647217,
1301
+ 0.6152714490890503
1302
+ ],
1303
+ [
1304
+ -0.17780931293964386,
1305
+ 0.43932393193244934,
1306
+ 0.6458268165588379
1307
+ ],
1308
+ [
1309
+ -0.528653085231781,
1310
+ -0.14353175461292267,
1311
+ 0.9229506254196167
1312
+ ],
1313
+ [
1314
+ 1.4555248022079468,
1315
+ -1.0259952545166016,
1316
+ -0.4562441408634186
1317
+ ],
1318
+ [
1319
+ -0.14756165444850922,
1320
+ -0.5561291575431824,
1321
+ 1.0768119096755981
1322
+ ],
1323
+ [
1324
+ -0.41391754150390625,
1325
+ -1.1037604808807373,
1326
+ 1.2041736841201782
1327
+ ],
1328
+ [
1329
+ 1.550291657447815,
1330
+ -1.0872933864593506,
1331
+ 0.7051491141319275
1332
+ ],
1333
+ [
1334
+ -0.04375248774886131,
1335
+ -0.4118551015853882,
1336
+ -0.36378324031829834
1337
+ ],
1338
+ [
1339
+ 0.05724472925066948,
1340
+ -0.18368586897850037,
1341
+ 0.1453598439693451
1342
+ ],
1343
+ [
1344
+ 1.626542091369629,
1345
+ -0.4099309742450714,
1346
+ 1.0515543222427368
1347
+ ],
1348
+ [
1349
+ -0.6195567846298218,
1350
+ 0.29181671142578125,
1351
+ 1.736567735671997
1352
+ ],
1353
+ [
1354
+ -0.12013274431228638,
1355
+ -0.9047581553459167,
1356
+ 0.5967634320259094
1357
+ ],
1358
+ [
1359
+ -0.39893123507499695,
1360
+ 0.6357282996177673,
1361
+ -0.6430260539054871
1362
+ ],
1363
+ [
1364
+ -0.08661642670631409,
1365
+ 0.6548808813095093,
1366
+ -0.41377395391464233
1367
+ ],
1368
+ [
1369
+ 1.4145830869674683,
1370
+ -0.19026178121566772,
1371
+ 0.8982827067375183
1372
+ ],
1373
+ [
1374
+ 1.058542013168335,
1375
+ 1.0563756227493286,
1376
+ 0.9325186610221863
1377
+ ],
1378
+ [
1379
+ 1.911826252937317,
1380
+ 0.45161741971969604,
1381
+ -0.18585841357707977
1382
+ ],
1383
+ [
1384
+ -0.332288920879364,
1385
+ 0.39983323216438293,
1386
+ -0.491976261138916
1387
+ ],
1388
+ [
1389
+ 1.5136407613754272,
1390
+ -0.2972322404384613,
1391
+ 0.957003116607666
1392
+ ],
1393
+ [
1394
+ -0.47592005133628845,
1395
+ 0.2637055218219757,
1396
+ 1.2554272413253784
1397
+ ],
1398
+ [
1399
+ -0.608280599117279,
1400
+ 0.31998905539512634,
1401
+ 0.8458261489868164
1402
+ ],
1403
+ [
1404
+ -0.911689281463623,
1405
+ 0.717431366443634,
1406
+ 0.3810383677482605
1407
+ ],
1408
+ [
1409
+ 1.7994369268417358,
1410
+ 0.760383665561676,
1411
+ 0.7605606913566589
1412
+ ],
1413
+ [
1414
+ 0.5412089228630066,
1415
+ -1.1707699298858643,
1416
+ -0.17998889088630676
1417
+ ],
1418
+ [
1419
+ 0.8193005919456482,
1420
+ 0.03770531341433525,
1421
+ -0.6201762557029724
1422
+ ],
1423
+ [
1424
+ -1.4711514711380005,
1425
+ 0.643258273601532,
1426
+ 1.6248557567596436
1427
+ ],
1428
+ [
1429
+ -0.4884009063243866,
1430
+ 0.1881939172744751,
1431
+ 1.156332015991211
1432
+ ],
1433
+ [
1434
+ -0.8440516591072083,
1435
+ 1.5180282592773438,
1436
+ 1.0642322301864624
1437
+ ],
1438
+ [
1439
+ -0.992385983467102,
1440
+ 0.29181671142578125,
1441
+ -0.8700374960899353
1442
+ ],
1443
+ [
1444
+ 2.151193857192993,
1445
+ 0.29181671142578125,
1446
+ -1.2509030103683472
1447
+ ],
1448
+ [
1449
+ 2.0959925651550293,
1450
+ -0.8486384153366089,
1451
+ -1.6925979852676392
1452
+ ],
1453
+ [
1454
+ -1.304608941078186,
1455
+ 1.8554006814956665,
1456
+ -0.2182377427816391
1457
+ ]
1458
+ ],
1459
+ "model_names": [
1460
+ "instruct",
1461
+ "math",
1462
+ "code"
1463
+ ],
1464
+ "num_models": 3,
1465
+ "num_params": 291,
1466
+ "param_names": [
1467
+ "model.embed_tokens.weight",
1468
+ "model.layers.0.self_attn.q_proj.weight",
1469
+ "model.layers.0.self_attn.k_proj.weight",
1470
+ "model.layers.0.self_attn.v_proj.weight",
1471
+ "model.layers.0.self_attn.o_proj.weight",
1472
+ "model.layers.0.mlp.gate_proj.weight",
1473
+ "model.layers.0.mlp.up_proj.weight",
1474
+ "model.layers.0.mlp.down_proj.weight",
1475
+ "model.layers.0.input_layernorm.weight",
1476
+ "model.layers.0.post_attention_layernorm.weight",
1477
+ "model.layers.1.self_attn.q_proj.weight",
1478
+ "model.layers.1.self_attn.k_proj.weight",
1479
+ "model.layers.1.self_attn.v_proj.weight",
1480
+ "model.layers.1.self_attn.o_proj.weight",
1481
+ "model.layers.1.mlp.gate_proj.weight",
1482
+ "model.layers.1.mlp.up_proj.weight",
1483
+ "model.layers.1.mlp.down_proj.weight",
1484
+ "model.layers.1.input_layernorm.weight",
1485
+ "model.layers.1.post_attention_layernorm.weight",
1486
+ "model.layers.2.self_attn.q_proj.weight",
1487
+ "model.layers.2.self_attn.k_proj.weight",
1488
+ "model.layers.2.self_attn.v_proj.weight",
1489
+ "model.layers.2.self_attn.o_proj.weight",
1490
+ "model.layers.2.mlp.gate_proj.weight",
1491
+ "model.layers.2.mlp.up_proj.weight",
1492
+ "model.layers.2.mlp.down_proj.weight",
1493
+ "model.layers.2.input_layernorm.weight",
1494
+ "model.layers.2.post_attention_layernorm.weight",
1495
+ "model.layers.3.self_attn.q_proj.weight",
1496
+ "model.layers.3.self_attn.k_proj.weight",
1497
+ "model.layers.3.self_attn.v_proj.weight",
1498
+ "model.layers.3.self_attn.o_proj.weight",
1499
+ "model.layers.3.mlp.gate_proj.weight",
1500
+ "model.layers.3.mlp.up_proj.weight",
1501
+ "model.layers.3.mlp.down_proj.weight",
1502
+ "model.layers.3.input_layernorm.weight",
1503
+ "model.layers.3.post_attention_layernorm.weight",
1504
+ "model.layers.4.self_attn.q_proj.weight",
1505
+ "model.layers.4.self_attn.k_proj.weight",
1506
+ "model.layers.4.self_attn.v_proj.weight",
1507
+ "model.layers.4.self_attn.o_proj.weight",
1508
+ "model.layers.4.mlp.gate_proj.weight",
1509
+ "model.layers.4.mlp.up_proj.weight",
1510
+ "model.layers.4.mlp.down_proj.weight",
1511
+ "model.layers.4.input_layernorm.weight",
1512
+ "model.layers.4.post_attention_layernorm.weight",
1513
+ "model.layers.5.self_attn.q_proj.weight",
1514
+ "model.layers.5.self_attn.k_proj.weight",
1515
+ "model.layers.5.self_attn.v_proj.weight",
1516
+ "model.layers.5.self_attn.o_proj.weight",
1517
+ "model.layers.5.mlp.gate_proj.weight",
1518
+ "model.layers.5.mlp.up_proj.weight",
1519
+ "model.layers.5.mlp.down_proj.weight",
1520
+ "model.layers.5.input_layernorm.weight",
1521
+ "model.layers.5.post_attention_layernorm.weight",
1522
+ "model.layers.6.self_attn.q_proj.weight",
1523
+ "model.layers.6.self_attn.k_proj.weight",
1524
+ "model.layers.6.self_attn.v_proj.weight",
1525
+ "model.layers.6.self_attn.o_proj.weight",
1526
+ "model.layers.6.mlp.gate_proj.weight",
1527
+ "model.layers.6.mlp.up_proj.weight",
1528
+ "model.layers.6.mlp.down_proj.weight",
1529
+ "model.layers.6.input_layernorm.weight",
1530
+ "model.layers.6.post_attention_layernorm.weight",
1531
+ "model.layers.7.self_attn.q_proj.weight",
1532
+ "model.layers.7.self_attn.k_proj.weight",
1533
+ "model.layers.7.self_attn.v_proj.weight",
1534
+ "model.layers.7.self_attn.o_proj.weight",
1535
+ "model.layers.7.mlp.gate_proj.weight",
1536
+ "model.layers.7.mlp.up_proj.weight",
1537
+ "model.layers.7.mlp.down_proj.weight",
1538
+ "model.layers.7.input_layernorm.weight",
1539
+ "model.layers.7.post_attention_layernorm.weight",
1540
+ "model.layers.8.self_attn.q_proj.weight",
1541
+ "model.layers.8.self_attn.k_proj.weight",
1542
+ "model.layers.8.self_attn.v_proj.weight",
1543
+ "model.layers.8.self_attn.o_proj.weight",
1544
+ "model.layers.8.mlp.gate_proj.weight",
1545
+ "model.layers.8.mlp.up_proj.weight",
1546
+ "model.layers.8.mlp.down_proj.weight",
1547
+ "model.layers.8.input_layernorm.weight",
1548
+ "model.layers.8.post_attention_layernorm.weight",
1549
+ "model.layers.9.self_attn.q_proj.weight",
1550
+ "model.layers.9.self_attn.k_proj.weight",
1551
+ "model.layers.9.self_attn.v_proj.weight",
1552
+ "model.layers.9.self_attn.o_proj.weight",
1553
+ "model.layers.9.mlp.gate_proj.weight",
1554
+ "model.layers.9.mlp.up_proj.weight",
1555
+ "model.layers.9.mlp.down_proj.weight",
1556
+ "model.layers.9.input_layernorm.weight",
1557
+ "model.layers.9.post_attention_layernorm.weight",
1558
+ "model.layers.10.self_attn.q_proj.weight",
1559
+ "model.layers.10.self_attn.k_proj.weight",
1560
+ "model.layers.10.self_attn.v_proj.weight",
1561
+ "model.layers.10.self_attn.o_proj.weight",
1562
+ "model.layers.10.mlp.gate_proj.weight",
1563
+ "model.layers.10.mlp.up_proj.weight",
1564
+ "model.layers.10.mlp.down_proj.weight",
1565
+ "model.layers.10.input_layernorm.weight",
1566
+ "model.layers.10.post_attention_layernorm.weight",
1567
+ "model.layers.11.self_attn.q_proj.weight",
1568
+ "model.layers.11.self_attn.k_proj.weight",
1569
+ "model.layers.11.self_attn.v_proj.weight",
1570
+ "model.layers.11.self_attn.o_proj.weight",
1571
+ "model.layers.11.mlp.gate_proj.weight",
1572
+ "model.layers.11.mlp.up_proj.weight",
1573
+ "model.layers.11.mlp.down_proj.weight",
1574
+ "model.layers.11.input_layernorm.weight",
1575
+ "model.layers.11.post_attention_layernorm.weight",
1576
+ "model.layers.12.self_attn.q_proj.weight",
1577
+ "model.layers.12.self_attn.k_proj.weight",
1578
+ "model.layers.12.self_attn.v_proj.weight",
1579
+ "model.layers.12.self_attn.o_proj.weight",
1580
+ "model.layers.12.mlp.gate_proj.weight",
1581
+ "model.layers.12.mlp.up_proj.weight",
1582
+ "model.layers.12.mlp.down_proj.weight",
1583
+ "model.layers.12.input_layernorm.weight",
1584
+ "model.layers.12.post_attention_layernorm.weight",
1585
+ "model.layers.13.self_attn.q_proj.weight",
1586
+ "model.layers.13.self_attn.k_proj.weight",
1587
+ "model.layers.13.self_attn.v_proj.weight",
1588
+ "model.layers.13.self_attn.o_proj.weight",
1589
+ "model.layers.13.mlp.gate_proj.weight",
1590
+ "model.layers.13.mlp.up_proj.weight",
1591
+ "model.layers.13.mlp.down_proj.weight",
1592
+ "model.layers.13.input_layernorm.weight",
1593
+ "model.layers.13.post_attention_layernorm.weight",
1594
+ "model.layers.14.self_attn.q_proj.weight",
1595
+ "model.layers.14.self_attn.k_proj.weight",
1596
+ "model.layers.14.self_attn.v_proj.weight",
1597
+ "model.layers.14.self_attn.o_proj.weight",
1598
+ "model.layers.14.mlp.gate_proj.weight",
1599
+ "model.layers.14.mlp.up_proj.weight",
1600
+ "model.layers.14.mlp.down_proj.weight",
1601
+ "model.layers.14.input_layernorm.weight",
1602
+ "model.layers.14.post_attention_layernorm.weight",
1603
+ "model.layers.15.self_attn.q_proj.weight",
1604
+ "model.layers.15.self_attn.k_proj.weight",
1605
+ "model.layers.15.self_attn.v_proj.weight",
1606
+ "model.layers.15.self_attn.o_proj.weight",
1607
+ "model.layers.15.mlp.gate_proj.weight",
1608
+ "model.layers.15.mlp.up_proj.weight",
1609
+ "model.layers.15.mlp.down_proj.weight",
1610
+ "model.layers.15.input_layernorm.weight",
1611
+ "model.layers.15.post_attention_layernorm.weight",
1612
+ "model.layers.16.self_attn.q_proj.weight",
1613
+ "model.layers.16.self_attn.k_proj.weight",
1614
+ "model.layers.16.self_attn.v_proj.weight",
1615
+ "model.layers.16.self_attn.o_proj.weight",
1616
+ "model.layers.16.mlp.gate_proj.weight",
1617
+ "model.layers.16.mlp.up_proj.weight",
1618
+ "model.layers.16.mlp.down_proj.weight",
1619
+ "model.layers.16.input_layernorm.weight",
1620
+ "model.layers.16.post_attention_layernorm.weight",
1621
+ "model.layers.17.self_attn.q_proj.weight",
1622
+ "model.layers.17.self_attn.k_proj.weight",
1623
+ "model.layers.17.self_attn.v_proj.weight",
1624
+ "model.layers.17.self_attn.o_proj.weight",
1625
+ "model.layers.17.mlp.gate_proj.weight",
1626
+ "model.layers.17.mlp.up_proj.weight",
1627
+ "model.layers.17.mlp.down_proj.weight",
1628
+ "model.layers.17.input_layernorm.weight",
1629
+ "model.layers.17.post_attention_layernorm.weight",
1630
+ "model.layers.18.self_attn.q_proj.weight",
1631
+ "model.layers.18.self_attn.k_proj.weight",
1632
+ "model.layers.18.self_attn.v_proj.weight",
1633
+ "model.layers.18.self_attn.o_proj.weight",
1634
+ "model.layers.18.mlp.gate_proj.weight",
1635
+ "model.layers.18.mlp.up_proj.weight",
1636
+ "model.layers.18.mlp.down_proj.weight",
1637
+ "model.layers.18.input_layernorm.weight",
1638
+ "model.layers.18.post_attention_layernorm.weight",
1639
+ "model.layers.19.self_attn.q_proj.weight",
1640
+ "model.layers.19.self_attn.k_proj.weight",
1641
+ "model.layers.19.self_attn.v_proj.weight",
1642
+ "model.layers.19.self_attn.o_proj.weight",
1643
+ "model.layers.19.mlp.gate_proj.weight",
1644
+ "model.layers.19.mlp.up_proj.weight",
1645
+ "model.layers.19.mlp.down_proj.weight",
1646
+ "model.layers.19.input_layernorm.weight",
1647
+ "model.layers.19.post_attention_layernorm.weight",
1648
+ "model.layers.20.self_attn.q_proj.weight",
1649
+ "model.layers.20.self_attn.k_proj.weight",
1650
+ "model.layers.20.self_attn.v_proj.weight",
1651
+ "model.layers.20.self_attn.o_proj.weight",
1652
+ "model.layers.20.mlp.gate_proj.weight",
1653
+ "model.layers.20.mlp.up_proj.weight",
1654
+ "model.layers.20.mlp.down_proj.weight",
1655
+ "model.layers.20.input_layernorm.weight",
1656
+ "model.layers.20.post_attention_layernorm.weight",
1657
+ "model.layers.21.self_attn.q_proj.weight",
1658
+ "model.layers.21.self_attn.k_proj.weight",
1659
+ "model.layers.21.self_attn.v_proj.weight",
1660
+ "model.layers.21.self_attn.o_proj.weight",
1661
+ "model.layers.21.mlp.gate_proj.weight",
1662
+ "model.layers.21.mlp.up_proj.weight",
1663
+ "model.layers.21.mlp.down_proj.weight",
1664
+ "model.layers.21.input_layernorm.weight",
1665
+ "model.layers.21.post_attention_layernorm.weight",
1666
+ "model.layers.22.self_attn.q_proj.weight",
1667
+ "model.layers.22.self_attn.k_proj.weight",
1668
+ "model.layers.22.self_attn.v_proj.weight",
1669
+ "model.layers.22.self_attn.o_proj.weight",
1670
+ "model.layers.22.mlp.gate_proj.weight",
1671
+ "model.layers.22.mlp.up_proj.weight",
1672
+ "model.layers.22.mlp.down_proj.weight",
1673
+ "model.layers.22.input_layernorm.weight",
1674
+ "model.layers.22.post_attention_layernorm.weight",
1675
+ "model.layers.23.self_attn.q_proj.weight",
1676
+ "model.layers.23.self_attn.k_proj.weight",
1677
+ "model.layers.23.self_attn.v_proj.weight",
1678
+ "model.layers.23.self_attn.o_proj.weight",
1679
+ "model.layers.23.mlp.gate_proj.weight",
1680
+ "model.layers.23.mlp.up_proj.weight",
1681
+ "model.layers.23.mlp.down_proj.weight",
1682
+ "model.layers.23.input_layernorm.weight",
1683
+ "model.layers.23.post_attention_layernorm.weight",
1684
+ "model.layers.24.self_attn.q_proj.weight",
1685
+ "model.layers.24.self_attn.k_proj.weight",
1686
+ "model.layers.24.self_attn.v_proj.weight",
1687
+ "model.layers.24.self_attn.o_proj.weight",
1688
+ "model.layers.24.mlp.gate_proj.weight",
1689
+ "model.layers.24.mlp.up_proj.weight",
1690
+ "model.layers.24.mlp.down_proj.weight",
1691
+ "model.layers.24.input_layernorm.weight",
1692
+ "model.layers.24.post_attention_layernorm.weight",
1693
+ "model.layers.25.self_attn.q_proj.weight",
1694
+ "model.layers.25.self_attn.k_proj.weight",
1695
+ "model.layers.25.self_attn.v_proj.weight",
1696
+ "model.layers.25.self_attn.o_proj.weight",
1697
+ "model.layers.25.mlp.gate_proj.weight",
1698
+ "model.layers.25.mlp.up_proj.weight",
1699
+ "model.layers.25.mlp.down_proj.weight",
1700
+ "model.layers.25.input_layernorm.weight",
1701
+ "model.layers.25.post_attention_layernorm.weight",
1702
+ "model.layers.26.self_attn.q_proj.weight",
1703
+ "model.layers.26.self_attn.k_proj.weight",
1704
+ "model.layers.26.self_attn.v_proj.weight",
1705
+ "model.layers.26.self_attn.o_proj.weight",
1706
+ "model.layers.26.mlp.gate_proj.weight",
1707
+ "model.layers.26.mlp.up_proj.weight",
1708
+ "model.layers.26.mlp.down_proj.weight",
1709
+ "model.layers.26.input_layernorm.weight",
1710
+ "model.layers.26.post_attention_layernorm.weight",
1711
+ "model.layers.27.self_attn.q_proj.weight",
1712
+ "model.layers.27.self_attn.k_proj.weight",
1713
+ "model.layers.27.self_attn.v_proj.weight",
1714
+ "model.layers.27.self_attn.o_proj.weight",
1715
+ "model.layers.27.mlp.gate_proj.weight",
1716
+ "model.layers.27.mlp.up_proj.weight",
1717
+ "model.layers.27.mlp.down_proj.weight",
1718
+ "model.layers.27.input_layernorm.weight",
1719
+ "model.layers.27.post_attention_layernorm.weight",
1720
+ "model.layers.28.self_attn.q_proj.weight",
1721
+ "model.layers.28.self_attn.k_proj.weight",
1722
+ "model.layers.28.self_attn.v_proj.weight",
1723
+ "model.layers.28.self_attn.o_proj.weight",
1724
+ "model.layers.28.mlp.gate_proj.weight",
1725
+ "model.layers.28.mlp.up_proj.weight",
1726
+ "model.layers.28.mlp.down_proj.weight",
1727
+ "model.layers.28.input_layernorm.weight",
1728
+ "model.layers.28.post_attention_layernorm.weight",
1729
+ "model.layers.29.self_attn.q_proj.weight",
1730
+ "model.layers.29.self_attn.k_proj.weight",
1731
+ "model.layers.29.self_attn.v_proj.weight",
1732
+ "model.layers.29.self_attn.o_proj.weight",
1733
+ "model.layers.29.mlp.gate_proj.weight",
1734
+ "model.layers.29.mlp.up_proj.weight",
1735
+ "model.layers.29.mlp.down_proj.weight",
1736
+ "model.layers.29.input_layernorm.weight",
1737
+ "model.layers.29.post_attention_layernorm.weight",
1738
+ "model.layers.30.self_attn.q_proj.weight",
1739
+ "model.layers.30.self_attn.k_proj.weight",
1740
+ "model.layers.30.self_attn.v_proj.weight",
1741
+ "model.layers.30.self_attn.o_proj.weight",
1742
+ "model.layers.30.mlp.gate_proj.weight",
1743
+ "model.layers.30.mlp.up_proj.weight",
1744
+ "model.layers.30.mlp.down_proj.weight",
1745
+ "model.layers.30.input_layernorm.weight",
1746
+ "model.layers.30.post_attention_layernorm.weight",
1747
+ "model.layers.31.self_attn.q_proj.weight",
1748
+ "model.layers.31.self_attn.k_proj.weight",
1749
+ "model.layers.31.self_attn.v_proj.weight",
1750
+ "model.layers.31.self_attn.o_proj.weight",
1751
+ "model.layers.31.mlp.gate_proj.weight",
1752
+ "model.layers.31.mlp.up_proj.weight",
1753
+ "model.layers.31.mlp.down_proj.weight",
1754
+ "model.layers.31.input_layernorm.weight",
1755
+ "model.layers.31.post_attention_layernorm.weight",
1756
+ "model.norm.weight",
1757
+ "lm_head.weight"
1758
+ ]
1759
+ }
logs/save_merged_model_20250703_174232.log ADDED
@@ -0,0 +1,74 @@
1
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Starting merged model save process
2
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Arguments: {'lambdas_path': '/work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/test-time-adaptation-self_certainty-lr0.01-ep1/llm_adamerge_parameterwise_lambdas.json', 'model_config': '/work/gj26/b20042/LLM-AdaMerge/src/configs/model_config.yaml', 'output_dir': '/work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-time-adaptation-self_certainty/ep1-lr0.01', 'model_name': 'merged-model', 'push_to_hub': False, 'hub_repo_id': 'lejelly/test-time-adaptation-self-certainty-ep1-lr001-llm-adamerge-mistral-7b-instrcut-math-code', 'private': False, 'device': 'cuda', 'debug': False}
3
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Loading lambdas from /work/gj26/b20042/LLM-AdaMerge/outputs/mistral-7b/parameter-wise/test-time-adaptation-self_certainty-lr0.01-ep1/llm_adamerge_parameterwise_lambdas.json
4
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Auto-detected parameter-wise merge from JSON structure
5
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Merge type: parameter_wise
6
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - [Initial] Memory Usage:
7
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Process: 0.47 GB (0.2%)
8
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - System: 7.89 GB / 212.49 GB (8.3%)
9
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Available: 194.78 GB
10
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
11
+ 2025-07-03 17:42:32 - experiment_save_merged_model - INFO - Loading models
12
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - [After loading models] Memory Usage:
13
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - Process: 41.58 GB (19.6%)
14
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - System: 49.11 GB / 212.49 GB (31.0%)
15
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - Available: 146.55 GB
16
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
17
+ 2025-07-03 17:42:50 - experiment_save_merged_model - INFO - Initializing parameter_wise AdaMerge
18
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Loading learned lambdas
19
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Deleting original models to free memory (task vectors already computed)
20
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - [Before deleting models] Memory Usage:
21
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Process: 95.77 GB (45.1%)
22
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - System: 90.34 GB / 212.49 GB (50.5%)
23
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Available: 105.25 GB
24
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
25
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Clearing model_loader references
26
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Deleting model variables
27
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Running garbage collection
28
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - [After deleting models and GC] Memory Usage:
29
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Process: 56.38 GB (26.5%)
30
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - System: 64.91 GB / 212.49 GB (38.5%)
31
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Available: 130.69 GB
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - [After loading lambdas] Memory Usage:
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Process: 56.38 GB (26.5%)
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - System: 64.91 GB / 212.49 GB (38.5%)
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Available: 130.69 GB
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Creating merged model with learned lambdas
+ 2025-07-03 17:44:05 - experiment_save_merged_model - INFO - Using merge_models_for_save()
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - [After merging models] Memory Usage:
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Process: 58.07 GB (27.3%)
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - System: 93.91 GB / 212.49 GB (49.0%)
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Available: 108.40 GB
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 27.23 GB, Total: 94.50 GB
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Freeing memory from AdaMerge object (task vectors and base params no longer needed)
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Deleting task vectors
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Deleting base params
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Deleting functional model
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - [After freeing AdaMerge memory] Memory Usage:
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Process: 6.08 GB (2.9%)
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - System: 27.73 GB / 212.49 GB (17.8%)
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Available: 174.58 GB
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - GPU 0: Allocated: 13.49 GB, Reserved: 13.62 GB, Total: 94.50 GB
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Saving merged model to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-time-adaptation-self_certainty/ep1-lr0.01
+ 2025-07-03 17:46:06 - experiment_save_merged_model - INFO - Moving merged model to CPU for saving
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Successfully saved 3 safetensors files:
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - - model-00001-of-00003.safetensors (4714.17 MB)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - - model-00003-of-00003.safetensors (4330.17 MB)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - - model-00002-of-00003.safetensors (4768.20 MB)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - [After saving model] Memory Usage:
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Process: 15.33 GB (7.2%)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - System: 23.37 GB / 212.49 GB (18.9%)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Available: 172.24 GB
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Saving tokenizer
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Copied lambdas file to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-time-adaptation-self_certainty/ep1-lr0.01/learned_lambdas.json
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Creating model card
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Cleaning up models
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - [After cleanup] Memory Usage:
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Process: 15.33 GB (7.2%)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - System: 23.37 GB / 212.49 GB (18.9%)
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Available: 172.24 GB
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - GPU 0: Allocated: 0.00 GB, Reserved: 0.00 GB, Total: 94.50 GB
+ 2025-07-03 17:46:46 - experiment_save_merged_model - INFO - Model saved successfully to /work/gj26/b20042/LLM-AdaMerge/mergekit/outputs/mistral-7b/llmadamerge/test-time-adaptation-self_certainty/ep1-lr0.01
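The log traces a deliberate memory-management sequence: merge on GPU, delete the AdaMerge intermediates (task vectors, base params, functional model), then move the merged model to CPU before writing sharded safetensors. A minimal sketch of that final step, assuming generic `transformers` objects rather than the project's own classes:

```python
# Hedged sketch of the save sequence the log describes; the project's
# own helpers (e.g. merge_models_for_save()) are not reproduced here.
import gc

import torch


def save_merged(merged_model, tokenizer, output_dir: str) -> None:
    # Reclaim GPU memory freed by deleting the merge intermediates.
    gc.collect()
    torch.cuda.empty_cache()
    # "Moving merged model to CPU for saving": shards are written from host RAM.
    merged_model.to("cpu")
    merged_model.save_pretrained(output_dir, safe_serialization=True)
    tokenizer.save_pretrained(output_dir)
```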
model-00001-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a0ddc16686c449aa23e78da65ee75a7d17b70c8c6190feef17412a55b557792a
+ size 4943162240
model-00002-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:77845dcd7fff8cb06f637402ba9b7a86393bf4b52ba10db67db482b156c24191
+ size 4999819232
model-00003-of-00003.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3954aac4b5141c80b398569a26fbf6c4133a9054dbdeb3dacfbfaad3c3bafe0e
+ size 4540516256
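The three `.safetensors` entries above are Git LFS pointer files, not the weights themselves: each stores the pointer-spec version, the sha256 of the real blob (`oid`), and its byte size. A hedged sketch of verifying a downloaded shard against its pointer, using the oid and size from the first pointer (the local path is an assumption):

```python
# Check that a downloaded shard matches its Git LFS pointer (oid + size).
import hashlib
from pathlib import Path

shard = Path("model-00001-of-00003.safetensors")  # assumed local path
expected_oid = "a0ddc16686c449aa23e78da65ee75a7d17b70c8c6190feef17412a55b557792a"
expected_size = 4943162240

assert shard.stat().st_size == expected_size, "size mismatch"

digest = hashlib.sha256()
with shard.open("rb") as f:
    # Stream in 1 MiB chunks; the shard is ~4.9 GB.
    for chunk in iter(lambda: f.read(1 << 20), b""):
        digest.update(chunk)
assert digest.hexdigest() == expected_oid, "sha256 mismatch"
```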
model.safetensors.index.json ADDED
@@ -0,0 +1,298 @@
+ {
+   "metadata": {
+     "total_size": 14483464192
+   },
+   "weight_map": {
+     "lm_head.weight": "model-00003-of-00003.safetensors",
+     "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.10.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.10.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.10.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.16.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.22.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.22.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.22.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.22.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.22.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.22.self_attn.o_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors",
+     "model.layers.23.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.23.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.24.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.26.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.27.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.28.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.29.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.30.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.30.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.input_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.mlp.down_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.mlp.gate_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.mlp.up_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.post_attention_layernorm.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.self_attn.k_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.self_attn.o_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.self_attn.q_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.31.self_attn.v_proj.weight": "model-00003-of-00003.safetensors",
+     "model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.input_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.mlp.down_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.mlp.gate_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.mlp.up_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.post_attention_layernorm.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.self_attn.k_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.self_attn.o_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
+     "model.layers.9.self_attn.v_proj.weight": "model-00001-of-00003.safetensors",
+     "model.norm.weight": "model-00003-of-00003.safetensors"
+   }
+ }
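The index's `weight_map` points every parameter name at the shard that stores it, with `metadata.total_size` giving the combined byte count, so a loader can open only the file it needs. A minimal sketch of fetching a single tensor this way, assuming the `safetensors` package and an illustrative local directory:

```python
# Read one tensor via the index without loading all three shards.
import json
from pathlib import Path

from safetensors import safe_open

model_dir = Path("ep1-lr0.01")  # assumed local copy of this repo
index = json.loads((model_dir / "model.safetensors.index.json").read_text())

name = "model.layers.10.mlp.down_proj.weight"
shard = index["weight_map"][name]  # -> "model-00002-of-00003.safetensors"

with safe_open(str(model_dir / shard), framework="pt", device="cpu") as f:
    tensor = f.get_tensor(name)
print(name, tuple(tensor.shape))
```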
special_tokens_map.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": "</s>",
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,44 @@
+ {
+   "add_bos_token": true,
+   "add_eos_token": false,
+   "add_prefix_space": null,
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "additional_special_tokens": [],
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": false,
+   "eos_token": "</s>",
+   "extra_special_tokens": {},
+   "legacy": false,
+   "model_max_length": 1000000000000000019884624838656,
+   "pad_token": "</s>",
+   "sp_model_kwargs": {},
+   "spaces_between_special_tokens": false,
+   "tokenizer_class": "LlamaTokenizerFast",
+   "unk_token": "<unk>",
+   "use_default_system_prompt": false
+ }
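Note that `pad_token` is mapped to the EOS token `</s>` (Mistral-7B-v0.1 ships no dedicated pad token) and `model_max_length` is left at the "effectively unbounded" sentinel. A quick hedged check after loading the saved tokenizer, path assumed:

```python
# Confirm the special-token setup of the saved tokenizer.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("ep1-lr0.01")  # assumed local path
assert tok.pad_token == tok.eos_token == "</s>"
assert tok.bos_token == "<s>" and tok.unk_token == "<unk>"
```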