mik3ml committed · Commit de3f36c · verified · 1 Parent(s): 690770d

Update README.md

Files changed (1):
  1. README.md +41 -45

README.md CHANGED
@@ -5,17 +5,53 @@ base_model: Qwen/Qwen2.5-0.5B-Instruct
  tags:
  - generated_from_trainer
  - axolotl
- model-index:
- - name: outputs/qwen05B
-   results: []
  language:
  - it
  - en
  pipeline_tag: text-generation
  ---

- <!-- This model card has been generated automatically according to the information the Trainer had access to. You
- should probably proofread and complete it, then remove this comment. -->
+ # Qwen2.5-0.5B-Instruct-ITA
+
+ This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) on the [ReDiX/DataForge](https://huggingface.co/datasets/ReDiX/DataForge) dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.4100
+
+ ## Model description
+
+ This model is an example of fine-tuning a small LLM (sLLM). The Italian evals improved, and the model learned from the training data as expected.
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+
+ | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
+ |------------|------:|------|-----:|--------|---|-----:|---|-----:|
+ |arc_it | 2|none | 0|acc |↑ |0.2378|± |0.0125|
+ | | |none | 0|acc_norm|↑ |0.2823|± |0.0132|
+ |hellaswag_it| 1|none | 0|acc |↑ |0.3163|± |0.0049|
+ | | |none | 0|acc_norm|↑ |0.3800|± |0.0051|
+ |m_mmlu_it | 0|none | 5|acc |↑ |0.381 |± |0.0042|
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 0.0001
+ - train_batch_size: 4
+ - eval_batch_size: 4
+ - seed: 42
+ - gradient_accumulation_steps: 4
+ - total_train_batch_size: 16
+ - optimizer: adamw_bnb_8bit with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 10
+ - num_epochs: 2
+

  [<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
  <details><summary>See axolotl config</summary>
@@ -149,46 +185,6 @@ special_tokens:

  </details><br>

- # outputs/qwen05B
-
- This model is a fine-tuned version of [Qwen/Qwen2.5-0.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-0.5B-Instruct) on the [ReDiX/DataForge](https://huggingface.co/datasets/ReDiX/DataForge) dataset.
- It achieves the following results on the evaluation set:
- - Loss: 1.4100
-
- ## Model description
-
- More information needed
-
- ## Intended uses & limitations
-
- More information needed
-
- ## Training and evaluation data
-
-
- | Tasks |Version|Filter|n-shot| Metric | |Value | |Stderr|
- |------------|------:|------|-----:|--------|---|-----:|---|-----:|
- |arc_it | 2|none | 0|acc |↑ |0.2378|± |0.0125|
- | | |none | 0|acc_norm|↑ |0.2823|± |0.0132|
- |hellaswag_it| 1|none | 0|acc |↑ |0.3163|± |0.0049|
- | | |none | 0|acc_norm|↑ |0.3800|± |0.0051|
- |m_mmlu_it | 0|none | 5|acc |↑ |0.381 |± |0.0042|
-
- ## Training procedure
-
- ### Training hyperparameters
-
- The following hyperparameters were used during training:
- - learning_rate: 0.0001
- - train_batch_size: 4
- - eval_batch_size: 4
- - seed: 42
- - gradient_accumulation_steps: 4
- - total_train_batch_size: 16
- - optimizer: Use adamw_bnb_8bit with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
- - lr_scheduler_type: cosine
- - lr_scheduler_warmup_steps: 10
- - num_epochs: 2

  ### Training results
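The updated card stops short of an inference example. A minimal sketch with the 🤗 Transformers API is below, assuming the published repo id is `ReDiX/Qwen2.5-0.5B-Instruct-ITA` (hypothetical, inferred from the card title; substitute the actual Hub id):

```python
# Minimal inference sketch. The repo id is an assumption inferred from the
# card title (Qwen2.5-0.5B-Instruct-ITA); substitute the actual Hub id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "ReDiX/Qwen2.5-0.5B-Instruct-ITA"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Qwen2.5 instruct checkpoints are chat models, so apply the chat template.
messages = [{"role": "user", "content": "Qual è la capitale d'Italia?"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```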
 
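The evaluation table added in this commit is in the output format of EleutherAI's lm-evaluation-harness. A sketch of how such numbers could be reproduced with the harness's Python API follows; per the table, m_mmlu_it is scored 5-shot while arc_it and hellaswag_it are 0-shot, so the runs are split. The harness version used for the card is not stated, and the repo id is again the hypothetical one from above:

```python
# Sketch of reproducing the card's Italian eval table with EleutherAI's
# lm-evaluation-harness (pip install lm-eval). Task names come straight from
# the table; the exact harness version used for the card is not stated.
import lm_eval

model_args = "pretrained=ReDiX/Qwen2.5-0.5B-Instruct-ITA"  # hypothetical repo id

# arc_it and hellaswag_it were reported 0-shot...
zero_shot = lm_eval.simple_evaluate(
    model="hf",
    model_args=model_args,
    tasks=["arc_it", "hellaswag_it"],
    num_fewshot=0,
)

# ...while m_mmlu_it was reported 5-shot, so it gets its own run.
five_shot = lm_eval.simple_evaluate(
    model="hf",
    model_args=model_args,
    tasks=["m_mmlu_it"],
    num_fewshot=5,
)

print(zero_shot["results"])
print(five_shot["results"])
```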
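One relationship among the hyperparameters is worth making explicit: the reported total_train_batch_size of 16 is train_batch_size × gradient_accumulation_steps × device count. The card does not state the number of GPUs, so a single device is assumed in this quick check:

```python
# Effective batch size implied by the card's hyperparameters.
train_batch_size = 4             # micro-batch size per device (from the card)
gradient_accumulation_steps = 4  # from the card
num_devices = 1                  # assumption; the card does not state the GPU count

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
assert total_train_batch_size == 16  # matches the card's reported value
print(total_train_batch_size)
```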