aborcs committed
Commit 0ca2503 · verified · Parent: 8e84937

Model save
README.md CHANGED
@@ -18,7 +18,7 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-8B](https://huggingface.co/meta-llama/Llama-3.1-8B) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.3789
+- Loss: 1.3304
 
 ## Model description
 
@@ -52,130 +52,130 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:--------:|:----:|:---------------:|
-| No log | 0.8889 | 4 | 1.6761 |
-| 2.1258 | 1.8333 | 8 | 1.5718 |
-| 1.8837 | 2.7778 | 12 | 1.4430 |
-| 1.8445 | 3.9444 | 17 | 1.3135 |
-| 1.5461 | 4.8889 | 21 | 1.2238 |
-| 1.4965 | 5.8333 | 25 | 1.1570 |
-| 1.4965 | 6.7778 | 29 | 1.0861 |
-| 1.4955 | 7.9444 | 34 | 1.0166 |
-| 1.1551 | 8.8889 | 38 | 0.9551 |
-| 1.1221 | 9.8333 | 42 | 0.9056 |
-| 0.8809 | 10.7778 | 46 | 0.8736 |
-| 0.9244 | 11.9444 | 51 | 0.8220 |
-| 0.7266 | 12.8889 | 55 | 0.7924 |
-| 0.7266 | 13.8333 | 59 | 0.7717 |
-| 0.8095 | 14.7778 | 63 | 0.7837 |
-| 0.5916 | 15.9444 | 68 | 0.7544 |
-| 0.5728 | 16.8889 | 72 | 0.7691 |
-| 0.5031 | 17.8333 | 76 | 0.7770 |
-| 0.455 | 18.7778 | 80 | 0.7815 |
-| 0.5259 | 19.9444 | 85 | 0.7667 |
-| 0.5259 | 20.8889 | 89 | 0.7855 |
-| 0.396 | 21.8333 | 93 | 0.8428 |
-| 0.3773 | 22.7778 | 97 | 0.8408 |
-| 0.2961 | 23.9444 | 102 | 0.8440 |
-| 0.301 | 24.8889 | 106 | 0.8690 |
-| 0.258 | 25.8333 | 110 | 0.8935 |
-| 0.258 | 26.7778 | 114 | 0.9402 |
-| 0.276 | 27.9444 | 119 | 0.9261 |
-| 0.2121 | 28.8889 | 123 | 0.8469 |
-| 0.2091 | 29.8333 | 127 | 0.9639 |
-| 0.2017 | 30.7778 | 131 | 0.9492 |
-| 0.1768 | 31.9444 | 136 | 1.0007 |
-| 0.182 | 32.8889 | 140 | 0.9828 |
-| 0.182 | 33.8333 | 144 | 1.0232 |
-| 0.1803 | 34.7778 | 148 | 1.0370 |
-| 0.1435 | 35.9444 | 153 | 1.0400 |
-| 0.1369 | 36.8889 | 157 | 1.0724 |
-| 0.1387 | 37.8333 | 161 | 1.0563 |
-| 0.1215 | 38.7778 | 165 | 1.1037 |
-| 0.1458 | 39.9444 | 170 | 1.0978 |
-| 0.1458 | 40.8889 | 174 | 1.1019 |
-| 0.1041 | 41.8333 | 178 | 1.0520 |
-| 0.0951 | 42.7778 | 182 | 1.1290 |
-| 0.1078 | 43.9444 | 187 | 1.0520 |
-| 0.0881 | 44.8889 | 191 | 1.0958 |
-| 0.0907 | 45.8333 | 195 | 1.1345 |
-| 0.0907 | 46.7778 | 199 | 1.1230 |
-| 0.0954 | 47.9444 | 204 | 1.1756 |
-| 0.0793 | 48.8889 | 208 | 1.1373 |
-| 0.0697 | 49.8333 | 212 | 1.2290 |
-| 0.0744 | 50.7778 | 216 | 1.1930 |
-| 0.069 | 51.9444 | 221 | 1.1722 |
-| 0.0605 | 52.8889 | 225 | 1.2236 |
-| 0.0605 | 53.8333 | 229 | 1.2027 |
-| 0.0647 | 54.7778 | 233 | 1.2243 |
-| 0.0554 | 55.9444 | 238 | 1.1833 |
-| 0.0501 | 56.8889 | 242 | 1.2003 |
-| 0.0507 | 57.8333 | 246 | 1.1891 |
-| 0.0462 | 58.7778 | 250 | 1.2117 |
-| 0.0565 | 59.9444 | 255 | 1.2295 |
-| 0.0565 | 60.8889 | 259 | 1.2175 |
-| 0.0431 | 61.8333 | 263 | 1.2548 |
-| 0.0407 | 62.7778 | 267 | 1.2471 |
-| 0.0442 | 63.9444 | 272 | 1.2386 |
-| 0.043 | 64.8889 | 276 | 1.2210 |
-| 0.0367 | 65.8333 | 280 | 1.2946 |
-| 0.0367 | 66.7778 | 284 | 1.2836 |
-| 0.045 | 67.9444 | 289 | 1.2767 |
-| 0.0371 | 68.8889 | 293 | 1.2844 |
-| 0.0321 | 69.8333 | 297 | 1.2783 |
-| 0.0318 | 70.7778 | 301 | 1.3028 |
-| 0.0334 | 71.9444 | 306 | 1.2752 |
-| 0.0305 | 72.8889 | 310 | 1.2902 |
-| 0.0305 | 73.8333 | 314 | 1.2932 |
-| 0.0368 | 74.7778 | 318 | 1.2865 |
-| 0.0293 | 75.9444 | 323 | 1.2920 |
-| 0.0287 | 76.8889 | 327 | 1.3078 |
-| 0.0255 | 77.8333 | 331 | 1.3379 |
-| 0.0261 | 78.7778 | 335 | 1.3403 |
-| 0.0322 | 79.9444 | 340 | 1.3113 |
-| 0.0322 | 80.8889 | 344 | 1.3084 |
-| 0.0247 | 81.8333 | 348 | 1.3199 |
-| 0.0262 | 82.7778 | 352 | 1.3243 |
-| 0.0236 | 83.9444 | 357 | 1.3464 |
-| 0.0234 | 84.8889 | 361 | 1.3538 |
-| 0.0215 | 85.8333 | 365 | 1.3555 |
-| 0.0215 | 86.7778 | 369 | 1.3502 |
-| 0.0247 | 87.9444 | 374 | 1.3510 |
-| 0.0207 | 88.8889 | 378 | 1.3588 |
-| 0.0225 | 89.8333 | 382 | 1.3616 |
-| 0.0208 | 90.7778 | 386 | 1.3562 |
-| 0.0229 | 91.9444 | 391 | 1.3470 |
-| 0.024 | 92.8889 | 395 | 1.3535 |
-| 0.024 | 93.8333 | 399 | 1.3578 |
-| 0.026 | 94.7778 | 403 | 1.3592 |
-| 0.0212 | 95.9444 | 408 | 1.3514 |
-| 0.0258 | 96.8889 | 412 | 1.3558 |
-| 0.0204 | 97.8333 | 416 | 1.3614 |
-| 0.0211 | 98.7778 | 420 | 1.3771 |
-| 0.0223 | 99.9444 | 425 | 1.3864 |
-| 0.0223 | 100.8889 | 429 | 1.3845 |
-| 0.0198 | 101.8333 | 433 | 1.3794 |
-| 0.0197 | 102.7778 | 437 | 1.3759 |
-| 0.0197 | 103.9444 | 442 | 1.3698 |
-| 0.0202 | 104.8889 | 446 | 1.3683 |
-| 0.0198 | 105.8333 | 450 | 1.3704 |
-| 0.0198 | 106.7778 | 454 | 1.3718 |
-| 0.0208 | 107.9444 | 459 | 1.3732 |
-| 0.0211 | 108.8889 | 463 | 1.3755 |
-| 0.0186 | 109.8333 | 467 | 1.3767 |
-| 0.0197 | 110.7778 | 471 | 1.3786 |
-| 0.0203 | 111.9444 | 476 | 1.3787 |
-| 0.0183 | 112.8889 | 480 | 1.3785 |
-| 0.0183 | 113.8333 | 484 | 1.3792 |
-| 0.0205 | 114.7778 | 488 | 1.3796 |
-| 0.0193 | 115.9444 | 493 | 1.3789 |
-| 0.0192 | 116.8889 | 497 | 1.3798 |
-| 0.0192 | 117.6111 | 500 | 1.3789 |
+| No log | 0.8889 | 4 | 1.6790 |
+| 2.1272 | 1.8333 | 8 | 1.5754 |
+| 1.8869 | 2.7778 | 12 | 1.4449 |
+| 1.8458 | 3.9444 | 17 | 1.3113 |
+| 1.5497 | 4.8889 | 21 | 1.2161 |
+| 1.4996 | 5.8333 | 25 | 1.1479 |
+| 1.4996 | 6.7778 | 29 | 1.0829 |
+| 1.5 | 7.9444 | 34 | 1.0096 |
+| 1.1576 | 8.8889 | 38 | 0.9470 |
+| 1.1188 | 9.8333 | 42 | 0.9070 |
+| 0.881 | 10.7778 | 46 | 0.8688 |
+| 0.9199 | 11.9444 | 51 | 0.8224 |
+| 0.7161 | 12.8889 | 55 | 0.7994 |
+| 0.7161 | 13.8333 | 59 | 0.7957 |
+| 0.7983 | 14.7778 | 63 | 0.7891 |
+| 0.5833 | 15.9444 | 68 | 0.7692 |
+| 0.5577 | 16.8889 | 72 | 0.7593 |
+| 0.4911 | 17.8333 | 76 | 0.7867 |
+| 0.4478 | 18.7778 | 80 | 0.8088 |
+| 0.5181 | 19.9444 | 85 | 0.8089 |
+| 0.5181 | 20.8889 | 89 | 0.7761 |
+| 0.3977 | 21.8333 | 93 | 0.7940 |
+| 0.3655 | 22.7778 | 97 | 0.8387 |
+| 0.293 | 23.9444 | 102 | 0.8603 |
+| 0.2978 | 24.8889 | 106 | 0.8603 |
+| 0.2573 | 25.8333 | 110 | 0.8431 |
+| 0.2573 | 26.7778 | 114 | 0.9431 |
+| 0.2802 | 27.9444 | 119 | 0.9213 |
+| 0.2116 | 28.8889 | 123 | 0.9327 |
+| 0.208 | 29.8333 | 127 | 0.9562 |
+| 0.2012 | 30.7778 | 131 | 0.9036 |
+| 0.1807 | 31.9444 | 136 | 0.9352 |
+| 0.1885 | 32.8889 | 140 | 1.0403 |
+| 0.1885 | 33.8333 | 144 | 0.9444 |
+| 0.1898 | 34.7778 | 148 | 0.9924 |
+| 0.1504 | 35.9444 | 153 | 1.0616 |
+| 0.14 | 36.8889 | 157 | 0.9799 |
+| 0.1428 | 37.8333 | 161 | 1.0503 |
+| 0.1174 | 38.7778 | 165 | 1.0565 |
+| 0.1513 | 39.9444 | 170 | 1.0090 |
+| 0.1513 | 40.8889 | 174 | 1.0892 |
+| 0.1053 | 41.8333 | 178 | 1.0162 |
+| 0.1056 | 42.7778 | 182 | 1.1173 |
+| 0.1127 | 43.9444 | 187 | 1.0811 |
+| 0.0927 | 44.8889 | 191 | 1.0970 |
+| 0.0963 | 45.8333 | 195 | 1.0959 |
+| 0.0963 | 46.7778 | 199 | 1.0603 |
+| 0.1043 | 47.9444 | 204 | 1.1082 |
+| 0.0845 | 48.8889 | 208 | 1.0794 |
+| 0.0728 | 49.8333 | 212 | 1.1056 |
+| 0.0779 | 50.7778 | 216 | 1.1265 |
+| 0.0706 | 51.9444 | 221 | 1.1261 |
+| 0.06 | 52.8889 | 225 | 1.1191 |
+| 0.06 | 53.8333 | 229 | 1.1820 |
+| 0.0692 | 54.7778 | 233 | 1.1651 |
+| 0.0558 | 55.9444 | 238 | 1.1954 |
+| 0.0529 | 56.8889 | 242 | 1.1271 |
+| 0.054 | 57.8333 | 246 | 1.0981 |
+| 0.0491 | 58.7778 | 250 | 1.1937 |
+| 0.0588 | 59.9444 | 255 | 1.1734 |
+| 0.0588 | 60.8889 | 259 | 1.2405 |
+| 0.0435 | 61.8333 | 263 | 1.1687 |
+| 0.0394 | 62.7778 | 267 | 1.1928 |
+| 0.0446 | 63.9444 | 272 | 1.2214 |
+| 0.0414 | 64.8889 | 276 | 1.2216 |
+| 0.0378 | 65.8333 | 280 | 1.2238 |
+| 0.0378 | 66.7778 | 284 | 1.2372 |
+| 0.0455 | 67.9444 | 289 | 1.2214 |
+| 0.0377 | 68.8889 | 293 | 1.2555 |
+| 0.0327 | 69.8333 | 297 | 1.2370 |
+| 0.033 | 70.7778 | 301 | 1.2383 |
+| 0.0342 | 71.9444 | 306 | 1.2499 |
+| 0.032 | 72.8889 | 310 | 1.2769 |
+| 0.032 | 73.8333 | 314 | 1.2521 |
+| 0.0389 | 74.7778 | 318 | 1.2544 |
+| 0.0312 | 75.9444 | 323 | 1.2710 |
+| 0.0294 | 76.8889 | 327 | 1.2853 |
+| 0.0269 | 77.8333 | 331 | 1.2947 |
+| 0.028 | 78.7778 | 335 | 1.3076 |
+| 0.0334 | 79.9444 | 340 | 1.3095 |
+| 0.0334 | 80.8889 | 344 | 1.2938 |
+| 0.0257 | 81.8333 | 348 | 1.2813 |
+| 0.0265 | 82.7778 | 352 | 1.2840 |
+| 0.0262 | 83.9444 | 357 | 1.2902 |
+| 0.0243 | 84.8889 | 361 | 1.3001 |
+| 0.0232 | 85.8333 | 365 | 1.3042 |
+| 0.0232 | 86.7778 | 369 | 1.3044 |
+| 0.027 | 87.9444 | 374 | 1.2909 |
+| 0.0224 | 88.8889 | 378 | 1.2925 |
+| 0.0239 | 89.8333 | 382 | 1.2949 |
+| 0.0221 | 90.7778 | 386 | 1.3046 |
+| 0.0244 | 91.9444 | 391 | 1.3120 |
+| 0.0256 | 92.8889 | 395 | 1.3179 |
+| 0.0256 | 93.8333 | 399 | 1.3150 |
+| 0.0276 | 94.7778 | 403 | 1.3069 |
+| 0.0226 | 95.9444 | 408 | 1.2978 |
+| 0.0279 | 96.8889 | 412 | 1.2995 |
+| 0.0218 | 97.8333 | 416 | 1.3054 |
+| 0.0224 | 98.7778 | 420 | 1.3163 |
+| 0.0236 | 99.9444 | 425 | 1.3296 |
+| 0.0236 | 100.8889 | 429 | 1.3317 |
+| 0.021 | 101.8333 | 433 | 1.3305 |
+| 0.0208 | 102.7778 | 437 | 1.3273 |
+| 0.0205 | 103.9444 | 442 | 1.3253 |
+| 0.0213 | 104.8889 | 446 | 1.3249 |
+| 0.0208 | 105.8333 | 450 | 1.3257 |
+| 0.0208 | 106.7778 | 454 | 1.3263 |
+| 0.0221 | 107.9444 | 459 | 1.3271 |
+| 0.0223 | 108.8889 | 463 | 1.3279 |
+| 0.0194 | 109.8333 | 467 | 1.3291 |
+| 0.0207 | 110.7778 | 471 | 1.3293 |
+| 0.0211 | 111.9444 | 476 | 1.3296 |
+| 0.0193 | 112.8889 | 480 | 1.3302 |
+| 0.0193 | 113.8333 | 484 | 1.3301 |
+| 0.0217 | 114.7778 | 488 | 1.3295 |
+| 0.0201 | 115.9444 | 493 | 1.3301 |
+| 0.0201 | 116.8889 | 497 | 1.3305 |
+| 0.0201 | 117.6111 | 500 | 1.3304 |
 
 
 ### Framework versions
 
 - PEFT 0.13.2
-- Transformers 4.46.0
-- Pytorch 2.5.0+cu124
+- Transformers 4.46.1
+- Pytorch 2.5.1+cu124
 - Datasets 3.0.2
 - Tokenizers 0.20.1
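In both versions of the log, validation loss bottoms out mid-training (around step 68–72) and climbs afterwards, so the final checkpoint's reported loss (1.3304) is not the best this run achieved. A minimal sketch (plain Python; the helper name `best_checkpoint` is hypothetical) for locating the best step in a table formatted like the one above, using a few rows sampled from the updated log:

```python
# A few rows copied from the updated table above, in the card's
# markdown format: | train_loss | epoch | step | val_loss |
rows = """
| No log | 0.8889 | 4 | 1.6790 |
| 2.1272 | 1.8333 | 8 | 1.5754 |
| 0.5577 | 16.8889 | 72 | 0.7593 |
| 0.0201 | 117.6111 | 500 | 1.3304 |
"""

def best_checkpoint(table_text):
    """Return (epoch, step, val_loss) of the row with the lowest validation loss."""
    best = None
    for line in table_text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        epoch, step, val_loss = float(cells[1]), int(cells[2]), float(cells[3])
        if best is None or val_loss < best[2]:
            best = (epoch, step, val_loss)
    return best

print(best_checkpoint(rows))  # -> (16.8889, 72, 0.7593)
```

Run over the full 118-row table, the same scan would identify the step-72 checkpoint (val loss 0.7593) as the best by this metric.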
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8673372c6d4cba21fa16cb3a3c7056a56e343cd7378b7c783e8ff82d95d6f62b
+oid sha256:eed48fc26a68a7aee75f4ab77a1f6eb5a6d56cf3998aa80e75198812aa79109a
 size 13648432
runs/Oct31_13-57-23_buda/events.out.tfevents.1730383046.buda.1.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d6b28eda11f7f646f47351151e56a73c765bcc8e30ef9e765d6e294ff5b334d9
-size 58697
+oid sha256:5298e55c6e663c6c6f24eedbc23881fc017c61976bc1831bfbe39c294d2bef91
+size 59322
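The `adapter_model.safetensors` and tfevents entries above are Git LFS pointer files: three `key value` lines (`version`, `oid`, `size`) that stand in for the binary blob in the repository. A minimal sketch, assuming only the three-line format shown in the diffs (the helper `parse_lfs_pointer` is hypothetical), for reading one:

```python
def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # oid is "<algorithm>:<hex digest>"; size is a decimal byte count
    algo, digest = fields["oid"].split(":", 1)
    return {"version": fields["version"], "algo": algo,
            "digest": digest, "size": int(fields["size"])}

# The updated adapter_model.safetensors pointer from the diff above:
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:eed48fc26a68a7aee75f4ab77a1f6eb5a6d56cf3998aa80e75198812aa79109a
size 13648432"""

info = parse_lfs_pointer(pointer)
print(info["algo"], info["size"])  # -> sha256 13648432
```

Note the `size` field is the size of the real file, not of the pointer itself, which is why the tfevents entry's size changes (58697 → 59322) while the adapter's stays at 13648432 bytes.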