---
library_name: transformers
license: mit
base_model: sbintuitions/modernbert-ja-310m
tags:
  - generated_from_trainer
metrics:
  - pearsonr
  - spearmanr
model-index:
  - name: test-clf-modernbert-310m
    results: []
---

# test-clf-modernbert-310m

This model is a fine-tuned version of [sbintuitions/modernbert-ja-310m](https://huggingface.co/sbintuitions/modernbert-ja-310m) on an unknown dataset.
It achieves the following results on the evaluation set:

- Loss: 1.0892
- MAE: 0.7812
- R2: 0.3990
- Pearson r: 0.6383
- Spearman r: 0.6272
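
The card does not include usage instructions, but the regression-style metrics above (MAE, R2, correlations) suggest a single-output sequence-classification head. A minimal loading-and-scoring sketch, assuming the hosting repository id `Aratako/reward-test-modernbert-310m` and a `num_labels=1` regression head:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: the model is hosted under this repo id and exposes a
# single-output regression head, consistent with the MAE/R2/correlation
# metrics reported above.
model_id = "Aratako/reward-test-modernbert-310m"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

# Score a Japanese input (the base model is a Japanese ModernBERT).
inputs = tokenizer("これはテスト用の文章です。", return_tensors="pt", truncation=True)
with torch.no_grad():
    score = model(**inputs).logits.squeeze(-1).item()
print(score)
```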

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 1e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine_with_min_lr
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 5
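
As a point of reference, these settings map onto Hugging Face `TrainingArguments` roughly as sketched below. This is not the exact training script; in particular, the `min_lr` value for the `cosine_with_min_lr` schedule is not reported in this card, so the one shown is a placeholder.

```python
from transformers import TrainingArguments

# A sketch of the hyperparameters listed above, not the original script.
training_args = TrainingArguments(
    output_dir="test-clf-modernbert-310m",
    learning_rate=1e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=2,  # 8 x 2 = effective train batch size 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_min_lr",
    lr_scheduler_kwargs={"min_lr": 1e-6},  # placeholder: actual min_lr unknown
    warmup_ratio=0.1,
    num_train_epochs=5,
)
```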

### Training results

| Training Loss | Epoch | Step | Validation Loss | MAE | R2 | Pearson r | Spearman r |
|:-------------:|:-----:|:----:|:---------------:|:---:|:--:|:---------:|:----------:|
| 38.8642 | 0.0440 | 30 | 7.6612 | 2.1780 | -2.7511 | 0.0043 | -0.0027 |
| 10.4776 | 0.0880 | 60 | 2.3989 | 1.1914 | -0.1745 | 0.2921 | 0.2652 |
| 3.6459 | 0.1320 | 90 | 2.3098 | 1.2163 | -0.1310 | 0.4206 | 0.4163 |
| 5.807 | 0.1760 | 120 | 2.0340 | 1.1310 | 0.0041 | 0.4436 | 0.4267 |
| 5.2356 | 0.2199 | 150 | 1.9798 | 1.0620 | 0.0306 | 0.4974 | 0.4648 |
| 4.4142 | 0.2639 | 180 | 1.7722 | 1.0213 | 0.1323 | 0.5154 | 0.4874 |
| 4.9085 | 0.3079 | 210 | 6.8826 | 2.3366 | -2.3699 | 0.5232 | 0.5094 |
| 8.0726 | 0.3519 | 240 | 1.4623 | 0.9226 | 0.2840 | 0.5386 | 0.5242 |
| 29.2783 | 0.3959 | 270 | 4.0754 | 1.6163 | -0.9954 | 0.4770 | 0.4772 |
| 5.8973 | 0.4399 | 300 | 3.0100 | 1.3312 | -0.4738 | 0.5204 | 0.5089 |
| 3.3493 | 0.4839 | 330 | 1.4475 | 0.8710 | 0.2913 | 0.5574 | 0.5520 |
| 6.7682 | 0.5279 | 360 | 1.3851 | 0.8808 | 0.3218 | 0.5715 | 0.5572 |
| 4.3158 | 0.5718 | 390 | 2.2720 | 1.2907 | -0.1124 | 0.5504 | 0.5330 |
| 15.823 | 0.6158 | 420 | 4.1442 | 1.7054 | -1.0291 | 0.5797 | 0.5572 |
| 8.0344 | 0.6598 | 450 | 2.7629 | 1.3644 | -0.3528 | 0.5669 | 0.5553 |
| 3.171 | 0.7038 | 480 | 2.0582 | 1.1288 | -0.0078 | 0.5012 | 0.5405 |
| 6.7538 | 0.7478 | 510 | 1.6033 | 1.0240 | 0.2150 | 0.5934 | 0.5790 |
| 6.1151 | 0.7918 | 540 | 1.6594 | 1.0670 | 0.1875 | 0.5697 | 0.5401 |
| 2.5472 | 0.8358 | 570 | 1.7069 | 1.0674 | 0.1643 | 0.5920 | 0.5763 |
| 3.9392 | 0.8798 | 600 | 2.1113 | 1.2292 | -0.0337 | 0.5871 | 0.5767 |
| 3.9147 | 0.9238 | 630 | 1.3118 | 0.8620 | 0.3577 | 0.6115 | 0.5876 |
| 5.7769 | 0.9677 | 660 | 2.6878 | 1.2831 | -0.3160 | 0.5764 | 0.5612 |
| 5.0716 | 1.0117 | 690 | 2.1133 | 1.0984 | -0.0347 | 0.6059 | 0.5900 |
| 4.7065 | 1.0557 | 720 | 2.6758 | 1.3888 | -0.3101 | 0.6116 | 0.5974 |
| 1.6835 | 1.0997 | 750 | 1.2992 | 0.8625 | 0.3639 | 0.6090 | 0.5826 |
| 5.2112 | 1.1437 | 780 | 1.8851 | 1.1165 | 0.0770 | 0.5974 | 0.5784 |
| 2.7997 | 1.1877 | 810 | 1.4227 | 0.9234 | 0.3034 | 0.6077 | 0.5810 |
| 1.9417 | 1.2317 | 840 | 1.5027 | 0.9326 | 0.2642 | 0.6310 | 0.6065 |
| 2.8662 | 1.2757 | 870 | 1.3368 | 0.8925 | 0.3454 | 0.6140 | 0.5774 |
| 4.2357 | 1.3196 | 900 | 2.6313 | 1.4141 | -0.2883 | 0.6385 | 0.6103 |
| 7.8053 | 1.3636 | 930 | 1.6020 | 0.9218 | 0.2156 | 0.6347 | 0.6080 |
| 1.1231 | 1.4076 | 960 | 1.4656 | 0.9488 | 0.2824 | 0.6385 | 0.6122 |
| 5.6334 | 1.4516 | 990 | 1.3516 | 0.9137 | 0.3382 | 0.6426 | 0.6221 |
| 4.371 | 1.4956 | 1020 | 2.6421 | 1.4260 | -0.2937 | 0.6369 | 0.6152 |
| 3.9286 | 1.5396 | 1050 | 1.4988 | 0.9515 | 0.2661 | 0.6398 | 0.6191 |
| 2.2357 | 1.5836 | 1080 | 1.3611 | 0.9070 | 0.3336 | 0.6290 | 0.6100 |
| 7.9489 | 1.6276 | 1110 | 1.2121 | 0.8059 | 0.4065 | 0.6418 | 0.6175 |
| 6.065 | 1.6716 | 1140 | 1.2714 | 0.8813 | 0.3775 | 0.6513 | 0.6241 |
| 2.1338 | 1.7155 | 1170 | 1.2413 | 0.8370 | 0.3922 | 0.6338 | 0.6065 |
| 2.5689 | 1.7595 | 1200 | 1.7681 | 1.0914 | 0.1343 | 0.6437 | 0.6228 |
| 1.4487 | 1.8035 | 1230 | 1.9605 | 1.1252 | 0.0401 | 0.6136 | 0.5836 |
| 2.2018 | 1.8475 | 1260 | 2.9671 | 1.5227 | -0.4528 | 0.6329 | 0.6100 |
| 2.8964 | 1.8915 | 1290 | 1.6779 | 1.0542 | 0.1784 | 0.6384 | 0.6163 |
| 2.1872 | 1.9355 | 1320 | 1.2393 | 0.8072 | 0.3932 | 0.6459 | 0.6272 |
| 3.2919 | 1.9795 | 1350 | 2.7018 | 1.4239 | -0.3229 | 0.6401 | 0.6227 |
| 2.5316 | 2.0235 | 1380 | 1.3240 | 0.8902 | 0.3517 | 0.6484 | 0.6285 |
| 2.0354 | 2.0674 | 1410 | 1.4146 | 0.9048 | 0.3074 | 0.6344 | 0.6130 |
| 2.9549 | 2.1114 | 1440 | 1.2957 | 0.8381 | 0.3656 | 0.6393 | 0.6228 |
| 3.5482 | 2.1554 | 1470 | 1.2744 | 0.8478 | 0.3760 | 0.6287 | 0.6077 |
| 2.3728 | 2.1994 | 1500 | 1.6528 | 1.0318 | 0.1907 | 0.6351 | 0.6166 |
| 2.9036 | 2.2434 | 1530 | 1.6116 | 1.0098 | 0.2109 | 0.6387 | 0.6141 |
| 2.4741 | 2.2874 | 1560 | 1.5921 | 1.0000 | 0.2204 | 0.6528 | 0.6346 |
| 1.3401 | 2.3314 | 1590 | 1.2849 | 0.8326 | 0.3709 | 0.6425 | 0.6294 |
| 2.1981 | 2.3754 | 1620 | 2.0894 | 1.1972 | -0.0230 | 0.6428 | 0.6306 |
| 3.6077 | 2.4194 | 1650 | 1.2730 | 0.8461 | 0.3767 | 0.6411 | 0.6263 |
| 1.2494 | 2.4633 | 1680 | 1.3331 | 0.8805 | 0.3473 | 0.6520 | 0.6388 |
| 1.6448 | 2.5073 | 1710 | 1.8776 | 1.1258 | 0.0807 | 0.6539 | 0.6358 |
| 1.6004 | 2.5513 | 1740 | 1.6464 | 1.0332 | 0.1939 | 0.6457 | 0.6231 |
| 2.6825 | 2.5953 | 1770 | 1.2436 | 0.8325 | 0.3911 | 0.6517 | 0.6305 |
| 4.1015 | 2.6393 | 1800 | 1.8048 | 1.1235 | 0.1163 | 0.6490 | 0.6281 |
| 2.3947 | 2.6833 | 1830 | 2.1353 | 1.2060 | -0.0455 | 0.6513 | 0.6283 |
| 3.6517 | 2.7273 | 1860 | 2.2012 | 1.2143 | -0.0778 | 0.6511 | 0.6259 |
| 1.283 | 2.7713 | 1890 | 1.4102 | 0.9454 | 0.3095 | 0.6475 | 0.6209 |
| 3.372 | 2.8152 | 1920 | 1.2497 | 0.8385 | 0.3881 | 0.6544 | 0.6310 |
| 0.9015 | 2.8592 | 1950 | 1.5059 | 0.9694 | 0.2627 | 0.6439 | 0.6275 |
| 2.1263 | 2.9032 | 1980 | 1.2574 | 0.8277 | 0.3844 | 0.6561 | 0.6392 |
| 1.7678 | 2.9472 | 2010 | 1.2511 | 0.8340 | 0.3874 | 0.6547 | 0.6378 |
| 0.8637 | 2.9912 | 2040 | 1.3555 | 0.8935 | 0.3363 | 0.6452 | 0.6275 |
| 1.1866 | 3.0352 | 2070 | 1.2389 | 0.8230 | 0.3934 | 0.6519 | 0.6355 |
| 1.521 | 3.0792 | 2100 | 1.3950 | 0.9128 | 0.3170 | 0.6416 | 0.6268 |
| 1.3431 | 3.1232 | 2130 | 1.3883 | 0.9162 | 0.3203 | 0.6406 | 0.6282 |
| 1.6443 | 3.1672 | 2160 | 1.2446 | 0.8213 | 0.3906 | 0.6430 | 0.6284 |
| 2.2007 | 3.2111 | 2190 | 1.4758 | 0.9392 | 0.2774 | 0.6456 | 0.6316 |
| 1.24 | 3.2551 | 2220 | 1.5468 | 0.9892 | 0.2426 | 0.6458 | 0.6308 |
| 0.7113 | 3.2991 | 2250 | 1.2618 | 0.8316 | 0.3822 | 0.6454 | 0.6275 |
| 1.9999 | 3.3431 | 2280 | 1.6327 | 1.0221 | 0.2006 | 0.6493 | 0.6304 |
| 0.4573 | 3.3871 | 2310 | 1.2183 | 0.8150 | 0.4035 | 0.6497 | 0.6301 |
| 0.1997 | 3.4311 | 2340 | 1.2584 | 0.8476 | 0.3838 | 0.6401 | 0.6219 |
| 0.6893 | 3.4751 | 2370 | 1.3907 | 0.9077 | 0.3191 | 0.6507 | 0.6344 |
| 2.5815 | 3.5191 | 2400 | 1.5668 | 0.9990 | 0.2329 | 0.6503 | 0.6342 |
| 0.5047 | 3.5630 | 2430 | 1.2605 | 0.8514 | 0.3828 | 0.6490 | 0.6313 |
| 0.6636 | 3.6070 | 2460 | 1.4618 | 0.9461 | 0.2843 | 0.6492 | 0.6363 |
| 0.6637 | 3.6510 | 2490 | 1.4765 | 0.9607 | 0.2770 | 0.6476 | 0.6356 |
| 0.9363 | 3.6950 | 2520 | 1.2501 | 0.8259 | 0.3879 | 0.6498 | 0.6337 |
| 0.7925 | 3.7390 | 2550 | 1.3660 | 0.8958 | 0.3312 | 0.6462 | 0.6318 |
| 1.8824 | 3.7830 | 2580 | 1.3078 | 0.8686 | 0.3597 | 0.6446 | 0.6312 |
| 1.4881 | 3.8270 | 2610 | 1.6678 | 1.0378 | 0.1834 | 0.6427 | 0.6292 |
| 1.2663 | 3.8710 | 2640 | 2.0540 | 1.1969 | -0.0057 | 0.6404 | 0.6242 |
| 0.9128 | 3.9150 | 2670 | 1.2595 | 0.8179 | 0.3833 | 0.6438 | 0.6273 |
| 1.3534 | 3.9589 | 2700 | 1.3228 | 0.8648 | 0.3523 | 0.6383 | 0.6224 |
| 0.3248 | 4.0029 | 2730 | 1.6017 | 0.9971 | 0.2157 | 0.6424 | 0.6260 |
| 0.4408 | 4.0469 | 2760 | 1.2523 | 0.8347 | 0.3868 | 0.6474 | 0.6290 |
| 0.6593 | 4.0909 | 2790 | 1.2593 | 0.8396 | 0.3834 | 0.6453 | 0.6277 |
| 0.5935 | 4.1349 | 2820 | 1.3069 | 0.8725 | 0.3601 | 0.6438 | 0.6277 |
| 0.5308 | 4.1789 | 2850 | 1.2745 | 0.8521 | 0.3760 | 0.6449 | 0.6290 |
| 0.94 | 4.2229 | 2880 | 1.3047 | 0.8737 | 0.3612 | 0.6448 | 0.6289 |
| 0.6516 | 4.2669 | 2910 | 1.4950 | 0.9587 | 0.2680 | 0.6452 | 0.6315 |
| 0.1789 | 4.3109 | 2940 | 1.3578 | 0.8991 | 0.3352 | 0.6453 | 0.6288 |
| 0.5594 | 4.3548 | 2970 | 1.4207 | 0.9304 | 0.3044 | 0.6458 | 0.6298 |
| 0.3357 | 4.3988 | 3000 | 1.5353 | 0.9849 | 0.2483 | 0.6452 | 0.6282 |
| 0.1883 | 4.4428 | 3030 | 1.4177 | 0.9274 | 0.3059 | 0.6483 | 0.6326 |
| 0.3584 | 4.4868 | 3060 | 1.3492 | 0.8908 | 0.3394 | 0.6498 | 0.6348 |
| 0.51 | 4.5308 | 3090 | 1.3724 | 0.9032 | 0.3280 | 0.6479 | 0.6324 |
| 0.2909 | 4.5748 | 3120 | 1.3617 | 0.8998 | 0.3333 | 0.6460 | 0.6302 |
| 0.4247 | 4.6188 | 3150 | 1.3533 | 0.8985 | 0.3374 | 0.6485 | 0.6334 |
| 0.5367 | 4.6628 | 3180 | 1.3397 | 0.8856 | 0.3441 | 0.6456 | 0.6312 |
| 0.4184 | 4.7067 | 3210 | 1.3487 | 0.8928 | 0.3396 | 0.6458 | 0.6306 |
| 0.2521 | 4.7507 | 3240 | 1.3022 | 0.8580 | 0.3624 | 0.6462 | 0.6307 |
| 0.2434 | 4.7947 | 3270 | 1.5001 | 0.9638 | 0.2655 | 0.6450 | 0.6305 |
| 0.2547 | 4.8387 | 3300 | 1.3812 | 0.9053 | 0.3237 | 0.6452 | 0.6300 |
| 0.9901 | 4.8827 | 3330 | 1.4053 | 0.9147 | 0.3119 | 0.6449 | 0.6292 |
| 0.1669 | 4.9267 | 3360 | 1.3150 | 0.8729 | 0.3561 | 0.6473 | 0.6319 |
| 0.3208 | 4.9707 | 3390 | 1.2870 | 0.8580 | 0.3698 | 0.6495 | 0.6346 |
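
The evaluation code is not included in the card, but the per-step metrics in the table above could be produced with a `Trainer` `compute_metrics` hook along these lines (a sketch; the exact implementation used for this run is unknown):

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr
from sklearn.metrics import mean_absolute_error, r2_score

def compute_metrics(eval_pred):
    """Compute MAE, R2, Pearson r, and Spearman r for a regression head."""
    predictions, labels = eval_pred
    predictions = np.asarray(predictions).squeeze(-1)
    labels = np.asarray(labels).squeeze()
    return {
        "mae": mean_absolute_error(labels, predictions),
        "r2": r2_score(labels, predictions),
        "pearsonr": pearsonr(labels, predictions)[0],
        "spearmanr": spearmanr(labels, predictions)[0],
    }
```

The validation loss itself would be computed by the `Trainer` from the model's own loss (MSE for a single-label regression head), so it is not part of this hook.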

### Framework versions

- Transformers 4.49.0
- PyTorch 2.4.1+cu124
- Datasets 3.3.2
- Tokenizers 0.21.0