---
library_name: transformers
license: mit
base_model: sbintuitions/modernbert-ja-310m
tags:
- generated_from_trainer
metrics:
- pearsonr
- spearmanr
model-index:
- name: test-clf-modernbert-310m
  results: []
---
# test-clf-modernbert-310m
This model is a fine-tuned version of [sbintuitions/modernbert-ja-310m](https://huggingface.co/sbintuitions/modernbert-ja-310m) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.0892
- MAE: 0.7812
- R²: 0.3990
- Pearson r: 0.6383
- Spearman ρ: 0.6272
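The exact metric function used during training is not included in this card. As a minimal numpy-only sketch, a `compute_metrics` callable for the `Trainer` producing these four regression metrics could look as follows (in practice `scipy.stats` or the `evaluate` library would typically be used; the rank-based Spearman here ignores ties):

```python
import numpy as np

def _rank(x):
    # Assign 0..n-1 ranks by sort order (ties not averaged;
    # scipy.stats.spearmanr handles ties properly).
    order = np.argsort(x)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(x))
    return ranks.astype(float)

def compute_metrics(eval_pred):
    """Regression metrics of the kind reported above (sketch)."""
    preds, labels = (np.asarray(a).squeeze() for a in eval_pred)
    mae = np.mean(np.abs(preds - labels))
    # R² = 1 - SS_res / SS_tot
    ss_res = np.sum((labels - preds) ** 2)
    ss_tot = np.sum((labels - labels.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    pearson = np.corrcoef(preds, labels)[0, 1]
    spearman = np.corrcoef(_rank(preds), _rank(labels))[0, 1]
    return {"mae": float(mae), "r2": float(r2),
            "pearsonr": float(pearson), "spearmanr": float(spearman)}
```

A function with this signature can be passed as `compute_metrics=` when constructing a `transformers.Trainer`.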
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
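The hyperparameters above map onto `transformers.TrainingArguments` keywords roughly as sketched below (the `output_dir` is a placeholder, and the dataset/model setup is omitted). Note that `total_train_batch_size: 16` is not set directly but derived from the per-device batch size and gradient accumulation:

```python
# Hyperparameters from this card, expressed as keyword arguments for
# transformers.TrainingArguments (a sketch, not the authors' actual script).
training_args = dict(
    output_dir="test-clf-modernbert-310m",  # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,
    lr_scheduler_type="cosine_with_min_lr",
    warmup_ratio=0.1,
    num_train_epochs=5,
)

# "total_train_batch_size: 16" = per-device batch size × accumulation steps
effective_batch = (training_args["per_device_train_batch_size"]
                   * training_args["gradient_accumulation_steps"])
```

With `cosine_with_min_lr`, a minimum learning rate is usually supplied via `lr_scheduler_kwargs`; its value is not recorded in this card.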
### Training results

| Training Loss | Epoch | Step | Validation Loss | MAE | R² | Pearson r | Spearman ρ |
|---|---|---|---|---|---|---|---|
38.8642 | 0.0440 | 30 | 7.6612 | 2.1780 | -2.7511 | 0.0043 | -0.0027 |
10.4776 | 0.0880 | 60 | 2.3989 | 1.1914 | -0.1745 | 0.2921 | 0.2652 |
3.6459 | 0.1320 | 90 | 2.3098 | 1.2163 | -0.1310 | 0.4206 | 0.4163 |
5.807 | 0.1760 | 120 | 2.0340 | 1.1310 | 0.0041 | 0.4436 | 0.4267 |
5.2356 | 0.2199 | 150 | 1.9798 | 1.0620 | 0.0306 | 0.4974 | 0.4648 |
4.4142 | 0.2639 | 180 | 1.7722 | 1.0213 | 0.1323 | 0.5154 | 0.4874 |
4.9085 | 0.3079 | 210 | 6.8826 | 2.3366 | -2.3699 | 0.5232 | 0.5094 |
8.0726 | 0.3519 | 240 | 1.4623 | 0.9226 | 0.2840 | 0.5386 | 0.5242 |
29.2783 | 0.3959 | 270 | 4.0754 | 1.6163 | -0.9954 | 0.4770 | 0.4772 |
5.8973 | 0.4399 | 300 | 3.0100 | 1.3312 | -0.4738 | 0.5204 | 0.5089 |
3.3493 | 0.4839 | 330 | 1.4475 | 0.8710 | 0.2913 | 0.5574 | 0.5520 |
6.7682 | 0.5279 | 360 | 1.3851 | 0.8808 | 0.3218 | 0.5715 | 0.5572 |
4.3158 | 0.5718 | 390 | 2.2720 | 1.2907 | -0.1124 | 0.5504 | 0.5330 |
15.823 | 0.6158 | 420 | 4.1442 | 1.7054 | -1.0291 | 0.5797 | 0.5572 |
8.0344 | 0.6598 | 450 | 2.7629 | 1.3644 | -0.3528 | 0.5669 | 0.5553 |
3.171 | 0.7038 | 480 | 2.0582 | 1.1288 | -0.0078 | 0.5012 | 0.5405 |
6.7538 | 0.7478 | 510 | 1.6033 | 1.0240 | 0.2150 | 0.5934 | 0.5790 |
6.1151 | 0.7918 | 540 | 1.6594 | 1.0670 | 0.1875 | 0.5697 | 0.5401 |
2.5472 | 0.8358 | 570 | 1.7069 | 1.0674 | 0.1643 | 0.5920 | 0.5763 |
3.9392 | 0.8798 | 600 | 2.1113 | 1.2292 | -0.0337 | 0.5871 | 0.5767 |
3.9147 | 0.9238 | 630 | 1.3118 | 0.8620 | 0.3577 | 0.6115 | 0.5876 |
5.7769 | 0.9677 | 660 | 2.6878 | 1.2831 | -0.3160 | 0.5764 | 0.5612 |
5.0716 | 1.0117 | 690 | 2.1133 | 1.0984 | -0.0347 | 0.6059 | 0.5900 |
4.7065 | 1.0557 | 720 | 2.6758 | 1.3888 | -0.3101 | 0.6116 | 0.5974 |
1.6835 | 1.0997 | 750 | 1.2992 | 0.8625 | 0.3639 | 0.6090 | 0.5826 |
5.2112 | 1.1437 | 780 | 1.8851 | 1.1165 | 0.0770 | 0.5974 | 0.5784 |
2.7997 | 1.1877 | 810 | 1.4227 | 0.9234 | 0.3034 | 0.6077 | 0.5810 |
1.9417 | 1.2317 | 840 | 1.5027 | 0.9326 | 0.2642 | 0.6310 | 0.6065 |
2.8662 | 1.2757 | 870 | 1.3368 | 0.8925 | 0.3454 | 0.6140 | 0.5774 |
4.2357 | 1.3196 | 900 | 2.6313 | 1.4141 | -0.2883 | 0.6385 | 0.6103 |
7.8053 | 1.3636 | 930 | 1.6020 | 0.9218 | 0.2156 | 0.6347 | 0.6080 |
1.1231 | 1.4076 | 960 | 1.4656 | 0.9488 | 0.2824 | 0.6385 | 0.6122 |
5.6334 | 1.4516 | 990 | 1.3516 | 0.9137 | 0.3382 | 0.6426 | 0.6221 |
4.371 | 1.4956 | 1020 | 2.6421 | 1.4260 | -0.2937 | 0.6369 | 0.6152 |
3.9286 | 1.5396 | 1050 | 1.4988 | 0.9515 | 0.2661 | 0.6398 | 0.6191 |
2.2357 | 1.5836 | 1080 | 1.3611 | 0.9070 | 0.3336 | 0.6290 | 0.6100 |
7.9489 | 1.6276 | 1110 | 1.2121 | 0.8059 | 0.4065 | 0.6418 | 0.6175 |
6.065 | 1.6716 | 1140 | 1.2714 | 0.8813 | 0.3775 | 0.6513 | 0.6241 |
2.1338 | 1.7155 | 1170 | 1.2413 | 0.8370 | 0.3922 | 0.6338 | 0.6065 |
2.5689 | 1.7595 | 1200 | 1.7681 | 1.0914 | 0.1343 | 0.6437 | 0.6228 |
1.4487 | 1.8035 | 1230 | 1.9605 | 1.1252 | 0.0401 | 0.6136 | 0.5836 |
2.2018 | 1.8475 | 1260 | 2.9671 | 1.5227 | -0.4528 | 0.6329 | 0.6100 |
2.8964 | 1.8915 | 1290 | 1.6779 | 1.0542 | 0.1784 | 0.6384 | 0.6163 |
2.1872 | 1.9355 | 1320 | 1.2393 | 0.8072 | 0.3932 | 0.6459 | 0.6272 |
3.2919 | 1.9795 | 1350 | 2.7018 | 1.4239 | -0.3229 | 0.6401 | 0.6227 |
2.5316 | 2.0235 | 1380 | 1.3240 | 0.8902 | 0.3517 | 0.6484 | 0.6285 |
2.0354 | 2.0674 | 1410 | 1.4146 | 0.9048 | 0.3074 | 0.6344 | 0.6130 |
2.9549 | 2.1114 | 1440 | 1.2957 | 0.8381 | 0.3656 | 0.6393 | 0.6228 |
3.5482 | 2.1554 | 1470 | 1.2744 | 0.8478 | 0.3760 | 0.6287 | 0.6077 |
2.3728 | 2.1994 | 1500 | 1.6528 | 1.0318 | 0.1907 | 0.6351 | 0.6166 |
2.9036 | 2.2434 | 1530 | 1.6116 | 1.0098 | 0.2109 | 0.6387 | 0.6141 |
2.4741 | 2.2874 | 1560 | 1.5921 | 1.0000 | 0.2204 | 0.6528 | 0.6346 |
1.3401 | 2.3314 | 1590 | 1.2849 | 0.8326 | 0.3709 | 0.6425 | 0.6294 |
2.1981 | 2.3754 | 1620 | 2.0894 | 1.1972 | -0.0230 | 0.6428 | 0.6306 |
3.6077 | 2.4194 | 1650 | 1.2730 | 0.8461 | 0.3767 | 0.6411 | 0.6263 |
1.2494 | 2.4633 | 1680 | 1.3331 | 0.8805 | 0.3473 | 0.6520 | 0.6388 |
1.6448 | 2.5073 | 1710 | 1.8776 | 1.1258 | 0.0807 | 0.6539 | 0.6358 |
1.6004 | 2.5513 | 1740 | 1.6464 | 1.0332 | 0.1939 | 0.6457 | 0.6231 |
2.6825 | 2.5953 | 1770 | 1.2436 | 0.8325 | 0.3911 | 0.6517 | 0.6305 |
4.1015 | 2.6393 | 1800 | 1.8048 | 1.1235 | 0.1163 | 0.6490 | 0.6281 |
2.3947 | 2.6833 | 1830 | 2.1353 | 1.2060 | -0.0455 | 0.6513 | 0.6283 |
3.6517 | 2.7273 | 1860 | 2.2012 | 1.2143 | -0.0778 | 0.6511 | 0.6259 |
1.283 | 2.7713 | 1890 | 1.4102 | 0.9454 | 0.3095 | 0.6475 | 0.6209 |
3.372 | 2.8152 | 1920 | 1.2497 | 0.8385 | 0.3881 | 0.6544 | 0.6310 |
0.9015 | 2.8592 | 1950 | 1.5059 | 0.9694 | 0.2627 | 0.6439 | 0.6275 |
2.1263 | 2.9032 | 1980 | 1.2574 | 0.8277 | 0.3844 | 0.6561 | 0.6392 |
1.7678 | 2.9472 | 2010 | 1.2511 | 0.8340 | 0.3874 | 0.6547 | 0.6378 |
0.8637 | 2.9912 | 2040 | 1.3555 | 0.8935 | 0.3363 | 0.6452 | 0.6275 |
1.1866 | 3.0352 | 2070 | 1.2389 | 0.8230 | 0.3934 | 0.6519 | 0.6355 |
1.521 | 3.0792 | 2100 | 1.3950 | 0.9128 | 0.3170 | 0.6416 | 0.6268 |
1.3431 | 3.1232 | 2130 | 1.3883 | 0.9162 | 0.3203 | 0.6406 | 0.6282 |
1.6443 | 3.1672 | 2160 | 1.2446 | 0.8213 | 0.3906 | 0.6430 | 0.6284 |
2.2007 | 3.2111 | 2190 | 1.4758 | 0.9392 | 0.2774 | 0.6456 | 0.6316 |
1.24 | 3.2551 | 2220 | 1.5468 | 0.9892 | 0.2426 | 0.6458 | 0.6308 |
0.7113 | 3.2991 | 2250 | 1.2618 | 0.8316 | 0.3822 | 0.6454 | 0.6275 |
1.9999 | 3.3431 | 2280 | 1.6327 | 1.0221 | 0.2006 | 0.6493 | 0.6304 |
0.4573 | 3.3871 | 2310 | 1.2183 | 0.8150 | 0.4035 | 0.6497 | 0.6301 |
0.1997 | 3.4311 | 2340 | 1.2584 | 0.8476 | 0.3838 | 0.6401 | 0.6219 |
0.6893 | 3.4751 | 2370 | 1.3907 | 0.9077 | 0.3191 | 0.6507 | 0.6344 |
2.5815 | 3.5191 | 2400 | 1.5668 | 0.9990 | 0.2329 | 0.6503 | 0.6342 |
0.5047 | 3.5630 | 2430 | 1.2605 | 0.8514 | 0.3828 | 0.6490 | 0.6313 |
0.6636 | 3.6070 | 2460 | 1.4618 | 0.9461 | 0.2843 | 0.6492 | 0.6363 |
0.6637 | 3.6510 | 2490 | 1.4765 | 0.9607 | 0.2770 | 0.6476 | 0.6356 |
0.9363 | 3.6950 | 2520 | 1.2501 | 0.8259 | 0.3879 | 0.6498 | 0.6337 |
0.7925 | 3.7390 | 2550 | 1.3660 | 0.8958 | 0.3312 | 0.6462 | 0.6318 |
1.8824 | 3.7830 | 2580 | 1.3078 | 0.8686 | 0.3597 | 0.6446 | 0.6312 |
1.4881 | 3.8270 | 2610 | 1.6678 | 1.0378 | 0.1834 | 0.6427 | 0.6292 |
1.2663 | 3.8710 | 2640 | 2.0540 | 1.1969 | -0.0057 | 0.6404 | 0.6242 |
0.9128 | 3.9150 | 2670 | 1.2595 | 0.8179 | 0.3833 | 0.6438 | 0.6273 |
1.3534 | 3.9589 | 2700 | 1.3228 | 0.8648 | 0.3523 | 0.6383 | 0.6224 |
0.3248 | 4.0029 | 2730 | 1.6017 | 0.9971 | 0.2157 | 0.6424 | 0.6260 |
0.4408 | 4.0469 | 2760 | 1.2523 | 0.8347 | 0.3868 | 0.6474 | 0.6290 |
0.6593 | 4.0909 | 2790 | 1.2593 | 0.8396 | 0.3834 | 0.6453 | 0.6277 |
0.5935 | 4.1349 | 2820 | 1.3069 | 0.8725 | 0.3601 | 0.6438 | 0.6277 |
0.5308 | 4.1789 | 2850 | 1.2745 | 0.8521 | 0.3760 | 0.6449 | 0.6290 |
0.94 | 4.2229 | 2880 | 1.3047 | 0.8737 | 0.3612 | 0.6448 | 0.6289 |
0.6516 | 4.2669 | 2910 | 1.4950 | 0.9587 | 0.2680 | 0.6452 | 0.6315 |
0.1789 | 4.3109 | 2940 | 1.3578 | 0.8991 | 0.3352 | 0.6453 | 0.6288 |
0.5594 | 4.3548 | 2970 | 1.4207 | 0.9304 | 0.3044 | 0.6458 | 0.6298 |
0.3357 | 4.3988 | 3000 | 1.5353 | 0.9849 | 0.2483 | 0.6452 | 0.6282 |
0.1883 | 4.4428 | 3030 | 1.4177 | 0.9274 | 0.3059 | 0.6483 | 0.6326 |
0.3584 | 4.4868 | 3060 | 1.3492 | 0.8908 | 0.3394 | 0.6498 | 0.6348 |
0.51 | 4.5308 | 3090 | 1.3724 | 0.9032 | 0.3280 | 0.6479 | 0.6324 |
0.2909 | 4.5748 | 3120 | 1.3617 | 0.8998 | 0.3333 | 0.6460 | 0.6302 |
0.4247 | 4.6188 | 3150 | 1.3533 | 0.8985 | 0.3374 | 0.6485 | 0.6334 |
0.5367 | 4.6628 | 3180 | 1.3397 | 0.8856 | 0.3441 | 0.6456 | 0.6312 |
0.4184 | 4.7067 | 3210 | 1.3487 | 0.8928 | 0.3396 | 0.6458 | 0.6306 |
0.2521 | 4.7507 | 3240 | 1.3022 | 0.8580 | 0.3624 | 0.6462 | 0.6307 |
0.2434 | 4.7947 | 3270 | 1.5001 | 0.9638 | 0.2655 | 0.6450 | 0.6305 |
0.2547 | 4.8387 | 3300 | 1.3812 | 0.9053 | 0.3237 | 0.6452 | 0.6300 |
0.9901 | 4.8827 | 3330 | 1.4053 | 0.9147 | 0.3119 | 0.6449 | 0.6292 |
0.1669 | 4.9267 | 3360 | 1.3150 | 0.8729 | 0.3561 | 0.6473 | 0.6319 |
0.3208 | 4.9707 | 3390 | 1.2870 | 0.8580 | 0.3698 | 0.6495 | 0.6346 |
### Framework versions

- Transformers 4.49.0
- PyTorch 2.4.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0