---
library_name: transformers
license: mit
base_model: sbintuitions/modernbert-ja-310m
tags:
- generated_from_trainer
metrics:
- pearsonr
- spearmanr
model-index:
- name: test-clf-modernbert-310m
  results: []
---


# test-clf-modernbert-310m

This model is a fine-tuned version of [sbintuitions/modernbert-ja-310m](https://huggingface.co/sbintuitions/modernbert-ja-310m) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 1.0892
- MAE: 0.7812
- R²: 0.3990
- Pearson r: 0.6383
- Spearman r: 0.6272
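These metrics point to a single continuous output (a regression head). The exact metric function used during training is not documented in this card, but a `compute_metrics` callable producing this set of values could be sketched as follows, assuming scikit-learn and SciPy for the metric implementations:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import mean_absolute_error, r2_score

def compute_metrics(eval_pred):
    """Compute MAE, R2, and Pearson/Spearman correlations for a
    single-output regression head. `eval_pred` is the usual
    (predictions, labels) pair passed by the HF Trainer."""
    predictions, labels = eval_pred
    predictions = np.asarray(predictions).squeeze(-1)  # (n, 1) -> (n,)
    labels = np.asarray(labels)
    return {
        "mae": mean_absolute_error(labels, predictions),
        "r2": r2_score(labels, predictions),
        "pearsonr": pearsonr(labels, predictions)[0],
        "spearmanr": spearmanr(labels, predictions)[0],
    }
```

Passing this as `compute_metrics=compute_metrics` to a `Trainer` would yield the metric names used in the table below.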

## Model description

The base model, [sbintuitions/modernbert-ja-310m](https://huggingface.co/sbintuitions/modernbert-ja-310m), is a 310M-parameter Japanese ModernBERT encoder. The reported evaluation metrics (MAE, R², Pearson and Spearman correlations) indicate that the fine-tuned head produces a single continuous score, i.e. a regression-style task; no further details about the task, labels, or dataset are available.

## Intended uses & limitations

More information needed
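Because the task and dataset are undocumented, the following is only a sketch of how the model could be loaded for inference, assuming a single-output regression head trained via `AutoModelForSequenceClassification`; the `repo_id` below is a placeholder, not a confirmed Hub repository id:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def predict_score(text: str, repo_id: str = "test-clf-modernbert-310m") -> float:
    """Return the model's single regression score for `text`.

    `repo_id` is a placeholder; replace it with the actual Hub
    repository id (or a local checkpoint path) before use.
    """
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForSequenceClassification.from_pretrained(repo_id)
    model.eval()
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        # A regression head yields logits of shape (1, 1); squeeze to a scalar.
        return model(**inputs).logits.squeeze().item()
```

The base model is a Japanese encoder, so inputs are expected to be Japanese text.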

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5

### Training results

| Training Loss | Epoch  | Step | Validation Loss | MAE    | R²      | Pearson r | Spearman r |
|:-------------:|:------:|:----:|:---------------:|:------:|:-------:|:---------:|:----------:|
| 38.8642       | 0.0440 | 30   | 7.6612          | 2.1780 | -2.7511 | 0.0043   | -0.0027   |
| 10.4776       | 0.0880 | 60   | 2.3989          | 1.1914 | -0.1745 | 0.2921   | 0.2652    |
| 3.6459        | 0.1320 | 90   | 2.3098          | 1.2163 | -0.1310 | 0.4206   | 0.4163    |
| 5.807         | 0.1760 | 120  | 2.0340          | 1.1310 | 0.0041  | 0.4436   | 0.4267    |
| 5.2356        | 0.2199 | 150  | 1.9798          | 1.0620 | 0.0306  | 0.4974   | 0.4648    |
| 4.4142        | 0.2639 | 180  | 1.7722          | 1.0213 | 0.1323  | 0.5154   | 0.4874    |
| 4.9085        | 0.3079 | 210  | 6.8826          | 2.3366 | -2.3699 | 0.5232   | 0.5094    |
| 8.0726        | 0.3519 | 240  | 1.4623          | 0.9226 | 0.2840  | 0.5386   | 0.5242    |
| 29.2783       | 0.3959 | 270  | 4.0754          | 1.6163 | -0.9954 | 0.4770   | 0.4772    |
| 5.8973        | 0.4399 | 300  | 3.0100          | 1.3312 | -0.4738 | 0.5204   | 0.5089    |
| 3.3493        | 0.4839 | 330  | 1.4475          | 0.8710 | 0.2913  | 0.5574   | 0.5520    |
| 6.7682        | 0.5279 | 360  | 1.3851          | 0.8808 | 0.3218  | 0.5715   | 0.5572    |
| 4.3158        | 0.5718 | 390  | 2.2720          | 1.2907 | -0.1124 | 0.5504   | 0.5330    |
| 15.823        | 0.6158 | 420  | 4.1442          | 1.7054 | -1.0291 | 0.5797   | 0.5572    |
| 8.0344        | 0.6598 | 450  | 2.7629          | 1.3644 | -0.3528 | 0.5669   | 0.5553    |
| 3.171         | 0.7038 | 480  | 2.0582          | 1.1288 | -0.0078 | 0.5012   | 0.5405    |
| 6.7538        | 0.7478 | 510  | 1.6033          | 1.0240 | 0.2150  | 0.5934   | 0.5790    |
| 6.1151        | 0.7918 | 540  | 1.6594          | 1.0670 | 0.1875  | 0.5697   | 0.5401    |
| 2.5472        | 0.8358 | 570  | 1.7069          | 1.0674 | 0.1643  | 0.5920   | 0.5763    |
| 3.9392        | 0.8798 | 600  | 2.1113          | 1.2292 | -0.0337 | 0.5871   | 0.5767    |
| 3.9147        | 0.9238 | 630  | 1.3118          | 0.8620 | 0.3577  | 0.6115   | 0.5876    |
| 5.7769        | 0.9677 | 660  | 2.6878          | 1.2831 | -0.3160 | 0.5764   | 0.5612    |
| 5.0716        | 1.0117 | 690  | 2.1133          | 1.0984 | -0.0347 | 0.6059   | 0.5900    |
| 4.7065        | 1.0557 | 720  | 2.6758          | 1.3888 | -0.3101 | 0.6116   | 0.5974    |
| 1.6835        | 1.0997 | 750  | 1.2992          | 0.8625 | 0.3639  | 0.6090   | 0.5826    |
| 5.2112        | 1.1437 | 780  | 1.8851          | 1.1165 | 0.0770  | 0.5974   | 0.5784    |
| 2.7997        | 1.1877 | 810  | 1.4227          | 0.9234 | 0.3034  | 0.6077   | 0.5810    |
| 1.9417        | 1.2317 | 840  | 1.5027          | 0.9326 | 0.2642  | 0.6310   | 0.6065    |
| 2.8662        | 1.2757 | 870  | 1.3368          | 0.8925 | 0.3454  | 0.6140   | 0.5774    |
| 4.2357        | 1.3196 | 900  | 2.6313          | 1.4141 | -0.2883 | 0.6385   | 0.6103    |
| 7.8053        | 1.3636 | 930  | 1.6020          | 0.9218 | 0.2156  | 0.6347   | 0.6080    |
| 1.1231        | 1.4076 | 960  | 1.4656          | 0.9488 | 0.2824  | 0.6385   | 0.6122    |
| 5.6334        | 1.4516 | 990  | 1.3516          | 0.9137 | 0.3382  | 0.6426   | 0.6221    |
| 4.371         | 1.4956 | 1020 | 2.6421          | 1.4260 | -0.2937 | 0.6369   | 0.6152    |
| 3.9286        | 1.5396 | 1050 | 1.4988          | 0.9515 | 0.2661  | 0.6398   | 0.6191    |
| 2.2357        | 1.5836 | 1080 | 1.3611          | 0.9070 | 0.3336  | 0.6290   | 0.6100    |
| 7.9489        | 1.6276 | 1110 | 1.2121          | 0.8059 | 0.4065  | 0.6418   | 0.6175    |
| 6.065         | 1.6716 | 1140 | 1.2714          | 0.8813 | 0.3775  | 0.6513   | 0.6241    |
| 2.1338        | 1.7155 | 1170 | 1.2413          | 0.8370 | 0.3922  | 0.6338   | 0.6065    |
| 2.5689        | 1.7595 | 1200 | 1.7681          | 1.0914 | 0.1343  | 0.6437   | 0.6228    |
| 1.4487        | 1.8035 | 1230 | 1.9605          | 1.1252 | 0.0401  | 0.6136   | 0.5836    |
| 2.2018        | 1.8475 | 1260 | 2.9671          | 1.5227 | -0.4528 | 0.6329   | 0.6100    |
| 2.8964        | 1.8915 | 1290 | 1.6779          | 1.0542 | 0.1784  | 0.6384   | 0.6163    |
| 2.1872        | 1.9355 | 1320 | 1.2393          | 0.8072 | 0.3932  | 0.6459   | 0.6272    |
| 3.2919        | 1.9795 | 1350 | 2.7018          | 1.4239 | -0.3229 | 0.6401   | 0.6227    |
| 2.5316        | 2.0235 | 1380 | 1.3240          | 0.8902 | 0.3517  | 0.6484   | 0.6285    |
| 2.0354        | 2.0674 | 1410 | 1.4146          | 0.9048 | 0.3074  | 0.6344   | 0.6130    |
| 2.9549        | 2.1114 | 1440 | 1.2957          | 0.8381 | 0.3656  | 0.6393   | 0.6228    |
| 3.5482        | 2.1554 | 1470 | 1.2744          | 0.8478 | 0.3760  | 0.6287   | 0.6077    |
| 2.3728        | 2.1994 | 1500 | 1.6528          | 1.0318 | 0.1907  | 0.6351   | 0.6166    |
| 2.9036        | 2.2434 | 1530 | 1.6116          | 1.0098 | 0.2109  | 0.6387   | 0.6141    |
| 2.4741        | 2.2874 | 1560 | 1.5921          | 1.0000 | 0.2204  | 0.6528   | 0.6346    |
| 1.3401        | 2.3314 | 1590 | 1.2849          | 0.8326 | 0.3709  | 0.6425   | 0.6294    |
| 2.1981        | 2.3754 | 1620 | 2.0894          | 1.1972 | -0.0230 | 0.6428   | 0.6306    |
| 3.6077        | 2.4194 | 1650 | 1.2730          | 0.8461 | 0.3767  | 0.6411   | 0.6263    |
| 1.2494        | 2.4633 | 1680 | 1.3331          | 0.8805 | 0.3473  | 0.6520   | 0.6388    |
| 1.6448        | 2.5073 | 1710 | 1.8776          | 1.1258 | 0.0807  | 0.6539   | 0.6358    |
| 1.6004        | 2.5513 | 1740 | 1.6464          | 1.0332 | 0.1939  | 0.6457   | 0.6231    |
| 2.6825        | 2.5953 | 1770 | 1.2436          | 0.8325 | 0.3911  | 0.6517   | 0.6305    |
| 4.1015        | 2.6393 | 1800 | 1.8048          | 1.1235 | 0.1163  | 0.6490   | 0.6281    |
| 2.3947        | 2.6833 | 1830 | 2.1353          | 1.2060 | -0.0455 | 0.6513   | 0.6283    |
| 3.6517        | 2.7273 | 1860 | 2.2012          | 1.2143 | -0.0778 | 0.6511   | 0.6259    |
| 1.283         | 2.7713 | 1890 | 1.4102          | 0.9454 | 0.3095  | 0.6475   | 0.6209    |
| 3.372         | 2.8152 | 1920 | 1.2497          | 0.8385 | 0.3881  | 0.6544   | 0.6310    |
| 0.9015        | 2.8592 | 1950 | 1.5059          | 0.9694 | 0.2627  | 0.6439   | 0.6275    |
| 2.1263        | 2.9032 | 1980 | 1.2574          | 0.8277 | 0.3844  | 0.6561   | 0.6392    |
| 1.7678        | 2.9472 | 2010 | 1.2511          | 0.8340 | 0.3874  | 0.6547   | 0.6378    |
| 0.8637        | 2.9912 | 2040 | 1.3555          | 0.8935 | 0.3363  | 0.6452   | 0.6275    |
| 1.1866        | 3.0352 | 2070 | 1.2389          | 0.8230 | 0.3934  | 0.6519   | 0.6355    |
| 1.521         | 3.0792 | 2100 | 1.3950          | 0.9128 | 0.3170  | 0.6416   | 0.6268    |
| 1.3431        | 3.1232 | 2130 | 1.3883          | 0.9162 | 0.3203  | 0.6406   | 0.6282    |
| 1.6443        | 3.1672 | 2160 | 1.2446          | 0.8213 | 0.3906  | 0.6430   | 0.6284    |
| 2.2007        | 3.2111 | 2190 | 1.4758          | 0.9392 | 0.2774  | 0.6456   | 0.6316    |
| 1.24          | 3.2551 | 2220 | 1.5468          | 0.9892 | 0.2426  | 0.6458   | 0.6308    |
| 0.7113        | 3.2991 | 2250 | 1.2618          | 0.8316 | 0.3822  | 0.6454   | 0.6275    |
| 1.9999        | 3.3431 | 2280 | 1.6327          | 1.0221 | 0.2006  | 0.6493   | 0.6304    |
| 0.4573        | 3.3871 | 2310 | 1.2183          | 0.8150 | 0.4035  | 0.6497   | 0.6301    |
| 0.1997        | 3.4311 | 2340 | 1.2584          | 0.8476 | 0.3838  | 0.6401   | 0.6219    |
| 0.6893        | 3.4751 | 2370 | 1.3907          | 0.9077 | 0.3191  | 0.6507   | 0.6344    |
| 2.5815        | 3.5191 | 2400 | 1.5668          | 0.9990 | 0.2329  | 0.6503   | 0.6342    |
| 0.5047        | 3.5630 | 2430 | 1.2605          | 0.8514 | 0.3828  | 0.6490   | 0.6313    |
| 0.6636        | 3.6070 | 2460 | 1.4618          | 0.9461 | 0.2843  | 0.6492   | 0.6363    |
| 0.6637        | 3.6510 | 2490 | 1.4765          | 0.9607 | 0.2770  | 0.6476   | 0.6356    |
| 0.9363        | 3.6950 | 2520 | 1.2501          | 0.8259 | 0.3879  | 0.6498   | 0.6337    |
| 0.7925        | 3.7390 | 2550 | 1.3660          | 0.8958 | 0.3312  | 0.6462   | 0.6318    |
| 1.8824        | 3.7830 | 2580 | 1.3078          | 0.8686 | 0.3597  | 0.6446   | 0.6312    |
| 1.4881        | 3.8270 | 2610 | 1.6678          | 1.0378 | 0.1834  | 0.6427   | 0.6292    |
| 1.2663        | 3.8710 | 2640 | 2.0540          | 1.1969 | -0.0057 | 0.6404   | 0.6242    |
| 0.9128        | 3.9150 | 2670 | 1.2595          | 0.8179 | 0.3833  | 0.6438   | 0.6273    |
| 1.3534        | 3.9589 | 2700 | 1.3228          | 0.8648 | 0.3523  | 0.6383   | 0.6224    |
| 0.3248        | 4.0029 | 2730 | 1.6017          | 0.9971 | 0.2157  | 0.6424   | 0.6260    |
| 0.4408        | 4.0469 | 2760 | 1.2523          | 0.8347 | 0.3868  | 0.6474   | 0.6290    |
| 0.6593        | 4.0909 | 2790 | 1.2593          | 0.8396 | 0.3834  | 0.6453   | 0.6277    |
| 0.5935        | 4.1349 | 2820 | 1.3069          | 0.8725 | 0.3601  | 0.6438   | 0.6277    |
| 0.5308        | 4.1789 | 2850 | 1.2745          | 0.8521 | 0.3760  | 0.6449   | 0.6290    |
| 0.94          | 4.2229 | 2880 | 1.3047          | 0.8737 | 0.3612  | 0.6448   | 0.6289    |
| 0.6516        | 4.2669 | 2910 | 1.4950          | 0.9587 | 0.2680  | 0.6452   | 0.6315    |
| 0.1789        | 4.3109 | 2940 | 1.3578          | 0.8991 | 0.3352  | 0.6453   | 0.6288    |
| 0.5594        | 4.3548 | 2970 | 1.4207          | 0.9304 | 0.3044  | 0.6458   | 0.6298    |
| 0.3357        | 4.3988 | 3000 | 1.5353          | 0.9849 | 0.2483  | 0.6452   | 0.6282    |
| 0.1883        | 4.4428 | 3030 | 1.4177          | 0.9274 | 0.3059  | 0.6483   | 0.6326    |
| 0.3584        | 4.4868 | 3060 | 1.3492          | 0.8908 | 0.3394  | 0.6498   | 0.6348    |
| 0.51          | 4.5308 | 3090 | 1.3724          | 0.9032 | 0.3280  | 0.6479   | 0.6324    |
| 0.2909        | 4.5748 | 3120 | 1.3617          | 0.8998 | 0.3333  | 0.6460   | 0.6302    |
| 0.4247        | 4.6188 | 3150 | 1.3533          | 0.8985 | 0.3374  | 0.6485   | 0.6334    |
| 0.5367        | 4.6628 | 3180 | 1.3397          | 0.8856 | 0.3441  | 0.6456   | 0.6312    |
| 0.4184        | 4.7067 | 3210 | 1.3487          | 0.8928 | 0.3396  | 0.6458   | 0.6306    |
| 0.2521        | 4.7507 | 3240 | 1.3022          | 0.8580 | 0.3624  | 0.6462   | 0.6307    |
| 0.2434        | 4.7947 | 3270 | 1.5001          | 0.9638 | 0.2655  | 0.6450   | 0.6305    |
| 0.2547        | 4.8387 | 3300 | 1.3812          | 0.9053 | 0.3237  | 0.6452   | 0.6300    |
| 0.9901        | 4.8827 | 3330 | 1.4053          | 0.9147 | 0.3119  | 0.6449   | 0.6292    |
| 0.1669        | 4.9267 | 3360 | 1.3150          | 0.8729 | 0.3561  | 0.6473   | 0.6319    |
| 0.3208        | 4.9707 | 3390 | 1.2870          | 0.8580 | 0.3698  | 0.6495   | 0.6346    |


### Framework versions

- Transformers 4.49.0
- Pytorch 2.4.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0