gemma-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05-20250411_135828

This model is a fine-tuned version of microsoft/mdeberta-v3-base (the training dataset is not specified). It achieves the following results on the evaluation set:

  • Loss: 0.2626
  • Precision: 0.6513
  • Recall: 0.5985
  • F1 Macro: 0.6074
  • Accuracy: 0.7401
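
The precision, recall, and F1 reported above are macro-averaged: each class's score is computed separately and the unweighted mean is taken, so rare classes count as much as common ones. As an illustrative sketch of that averaging (not the card's actual evaluation code; the toy labels below are made up):

```python
# Macro-averaged precision/recall/F1 plus plain accuracy, computed per class
# and averaged with equal weight. Labels here are illustrative only.
def macro_metrics(y_true, y_pred, num_classes):
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = num_classes
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n, accuracy

y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]
prec, rec, f1, acc = macro_metrics(y_true, y_pred, num_classes=3)
```

Note that with macro averaging, accuracy can sit well above the F1 macro (as in the table above) when frequent classes are predicted well but rarer ones are not.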

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 0
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
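
With a linear scheduler, the learning rate decays from its initial value to zero over the full run. A minimal sketch of that schedule under the hyperparameters above (assuming zero warmup steps, which the list does not mention, and a total step count of ~102,360 inferred from the ~5,118 logged steps per epoch over 20 epochs):

```python
# Linear warmup (if any) followed by linear decay to zero, mirroring a
# "linear" lr_scheduler_type. warmup_steps=0 is an assumption.
def linear_lr(step, total_steps, base_lr=5e-05, warmup_steps=0):
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

total = 102360  # approximate: 20 epochs * ~5118 steps/epoch
lr_start = linear_lr(0, total)        # full base learning rate at step 0
lr_mid = linear_lr(total // 2, total) # halfway through: half the base rate
```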

Training results

| Training Loss | Epoch   | Step   | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|:-------------:|:-------:|:------:|:---------------:|:---------:|:------:|:--------:|:--------:|
| No log        | 0       | 0      | 5.3215          | 0.0433    | 0.25   | 0.0737   | 0.1730   |
| 0.2231        | 0.1954  | 1000   | 0.2100          | 0.5533    | 0.5211 | 0.5332   | 0.7474   |
| 0.2297        | 0.3908  | 2000   | 0.2306          | 0.5656    | 0.4913 | 0.4884   | 0.7059   |
| 0.2044        | 0.5862  | 3000   | 0.2403          | 0.5660    | 0.4995 | 0.4964   | 0.7056   |
| 0.2277        | 0.7816  | 4000   | 0.1904          | 0.7714    | 0.5493 | 0.5609   | 0.7602   |
| 0.2007        | 0.9769  | 5000   | 0.1868          | 0.5619    | 0.5484 | 0.5500   | 0.7554   |
| 0.1698        | 1.1723  | 6000   | 0.2103          | 0.6577    | 0.6027 | 0.5986   | 0.7503   |
| 0.172         | 1.3677  | 7000   | 0.1946          | 0.5552    | 0.5522 | 0.5524   | 0.7581   |
| 0.1664        | 1.5631  | 8000   | 0.2038          | 0.5294    | 0.5785 | 0.5502   | 0.7417   |
| 0.1779        | 1.7585  | 9000   | 0.1916          | 0.6656    | 0.5889 | 0.5988   | 0.7583   |
| 0.1723        | 1.9539  | 10000  | 0.1867          | 0.7232    | 0.5817 | 0.5931   | 0.7641   |
| 0.1016        | 2.1493  | 11000  | 0.2077          | 0.7968    | 0.5421 | 0.5452   | 0.7499   |
| 0.1015        | 2.3447  | 12000  | 0.2390          | 0.6151    | 0.6351 | 0.6241   | 0.7350   |
| 0.1058        | 2.5401  | 13000  | 0.2189          | 0.6452    | 0.5951 | 0.6120   | 0.7461   |
| 0.109         | 2.7354  | 14000  | 0.2606          | 0.6620    | 0.5765 | 0.5904   | 0.7174   |
| 0.1054        | 2.9308  | 15000  | 0.2179          | 0.6465    | 0.5899 | 0.6009   | 0.7384   |
| 0.0624        | 3.1262  | 16000  | 0.2361          | 0.6394    | 0.5840 | 0.5927   | 0.7358   |
| 0.0615        | 3.3216  | 17000  | 0.2616          | 0.6494    | 0.5637 | 0.5829   | 0.7109   |
| 0.0768        | 3.5170  | 18000  | 0.2375          | 0.6464    | 0.5662 | 0.5911   | 0.7385   |
| 0.0631        | 3.7124  | 19000  | 0.2551          | 0.6190    | 0.6100 | 0.6104   | 0.7263   |
| 0.0691        | 3.9078  | 20000  | 0.2475          | 0.6438    | 0.5896 | 0.6018   | 0.7248   |
| 0.0462        | 4.1032  | 21000  | 0.2942          | 0.6375    | 0.5898 | 0.5901   | 0.6980   |
| 0.0468        | 4.2986  | 22000  | 0.2548          | 0.6522    | 0.5877 | 0.5971   | 0.7290   |
| 0.0518        | 4.4939  | 23000  | 0.2544          | 0.6261    | 0.5922 | 0.5959   | 0.7318   |
| 0.043         | 4.6893  | 24000  | 0.2512          | 0.6225    | 0.5912 | 0.5981   | 0.7298   |
| 0.0556        | 4.8847  | 25000  | 0.2446          | 0.6125    | 0.5965 | 0.6029   | 0.7339   |
| 0.0322        | 5.0801  | 26000  | 0.2861          | 0.6231    | 0.6128 | 0.6122   | 0.7225   |
| 0.0446        | 5.2755  | 27000  | 0.2399          | 0.6401    | 0.5812 | 0.5989   | 0.7397   |
| 0.0391        | 5.4709  | 28000  | 0.2744          | 0.6371    | 0.5929 | 0.6028   | 0.7204   |
| 0.0387        | 5.6663  | 29000  | 0.2498          | 0.6340    | 0.5671 | 0.5876   | 0.7370   |
| 0.0368        | 5.8617  | 30000  | 0.2503          | 0.6565    | 0.5890 | 0.5983   | 0.7349   |
| 0.0244        | 6.0571  | 31000  | 0.3083          | 0.5972    | 0.5978 | 0.5869   | 0.6873   |
| 0.024         | 6.2524  | 32000  | 0.2607          | 0.6283    | 0.6002 | 0.6074   | 0.7217   |
| 0.0271        | 6.4478  | 33000  | 0.2489          | 0.6461    | 0.5800 | 0.5922   | 0.7371   |
| 0.0235        | 6.6432  | 34000  | 0.2582          | 0.6350    | 0.5961 | 0.6047   | 0.7221   |
| 0.0224        | 6.8386  | 35000  | 0.2660          | 0.6211    | 0.5978 | 0.6048   | 0.7238   |
| 0.0156        | 7.0340  | 36000  | 0.2567          | 0.6686    | 0.5831 | 0.5865   | 0.7394   |
| 0.0186        | 7.2294  | 37000  | 0.2675          | 0.6115    | 0.6055 | 0.6041   | 0.7247   |
| 0.0202        | 7.4248  | 38000  | 0.2697          | 0.6234    | 0.5890 | 0.5975   | 0.7188   |
| 0.0192        | 7.6202  | 39000  | 0.2708          | 0.6097    | 0.6224 | 0.6137   | 0.7242   |
| 0.0192        | 7.8156  | 40000  | 0.2645          | 0.6147    | 0.6096 | 0.6098   | 0.7281   |
| 0.0206        | 8.0109  | 41000  | 0.2659          | 0.6197    | 0.6013 | 0.5995   | 0.7254   |
| 0.0176        | 8.2063  | 42000  | 0.2650          | 0.6327    | 0.5964 | 0.6004   | 0.7191   |
| 0.0163        | 8.4017  | 43000  | 0.2557          | 0.6793    | 0.5860 | 0.5959   | 0.7347   |
| 0.0211        | 8.5971  | 44000  | 0.2563          | 0.6486    | 0.6054 | 0.6075   | 0.7357   |
| 0.0135        | 8.7925  | 45000  | 0.2618          | 0.6867    | 0.5745 | 0.5789   | 0.7397   |
| 0.0169        | 8.9879  | 46000  | 0.2635          | 0.6209    | 0.5991 | 0.6041   | 0.7338   |
| 0.0114        | 9.1833  | 47000  | 0.2571          | 0.6272    | 0.5987 | 0.6063   | 0.7363   |
| 0.0122        | 9.3787  | 48000  | 0.2567          | 0.6300    | 0.5884 | 0.5973   | 0.7359   |
| 0.0136        | 9.5741  | 49000  | 0.2595          | 0.6874    | 0.5439 | 0.5595   | 0.7436   |
| 0.0124        | 9.7694  | 50000  | 0.2607          | 0.6353    | 0.5934 | 0.5980   | 0.7307   |
| 0.0124        | 9.9648  | 51000  | 0.2485          | 0.6560    | 0.5905 | 0.6003   | 0.7446   |
| 0.0132        | 10.1602 | 52000  | 0.2623          | 0.6437    | 0.5859 | 0.5871   | 0.7294   |
| 0.0134        | 10.3556 | 53000  | 0.2516          | 0.6420    | 0.5833 | 0.5991   | 0.7464   |
| 0.0151        | 10.5510 | 54000  | 0.2632          | 0.6358    | 0.5842 | 0.5926   | 0.7320   |
| 0.0074        | 10.7464 | 55000  | 0.2582          | 0.6711    | 0.5888 | 0.5987   | 0.7383   |
| 0.0078        | 10.9418 | 56000  | 0.2672          | 0.6205    | 0.5981 | 0.6006   | 0.7253   |
| 0.0093        | 11.1372 | 57000  | 0.2681          | 0.6323    | 0.5827 | 0.5884   | 0.7284   |
| 0.009         | 11.3326 | 58000  | 0.2699          | 0.6433    | 0.5984 | 0.5996   | 0.7220   |
| 0.0077        | 11.5279 | 59000  | 0.2542          | 0.6725    | 0.5778 | 0.5863   | 0.7420   |
| 0.0098        | 11.7233 | 60000  | 0.2520          | 0.6838    | 0.5810 | 0.5876   | 0.7468   |
| 0.0085        | 11.9187 | 61000  | 0.2738          | 0.6212    | 0.6099 | 0.6090   | 0.7157   |
| 0.008         | 12.1141 | 62000  | 0.2570          | 0.6576    | 0.5832 | 0.5956   | 0.7366   |
| 0.0061        | 12.3095 | 63000  | 0.2609          | 0.6466    | 0.5962 | 0.6067   | 0.7386   |
| 0.0077        | 12.5049 | 64000  | 0.2675          | 0.6180    | 0.5976 | 0.5956   | 0.7317   |
| 0.0062        | 12.7003 | 65000  | 0.2627          | 0.6233    | 0.6045 | 0.6066   | 0.7348   |
| 0.0041        | 12.8957 | 66000  | 0.2620          | 0.6568    | 0.5705 | 0.5855   | 0.7403   |
| 0.0016        | 13.0911 | 67000  | 0.2597          | 0.6470    | 0.5811 | 0.5912   | 0.7369   |
| 0.0085        | 13.2864 | 68000  | 0.2531          | 0.6584    | 0.5801 | 0.5928   | 0.7442   |
| 0.0147        | 13.4818 | 69000  | 0.2659          | 0.6274    | 0.6133 | 0.6131   | 0.7248   |
| 0.0029        | 13.6772 | 70000  | 0.2727          | 0.6073    | 0.6093 | 0.6024   | 0.7230   |
| 0.0034        | 13.8726 | 71000  | 0.2622          | 0.6518    | 0.5709 | 0.5737   | 0.7354   |
| 0.001         | 14.0680 | 72000  | 0.2635          | 0.6441    | 0.5877 | 0.6050   | 0.7388   |
| 0.0033        | 14.2634 | 73000  | 0.2684          | 0.6366    | 0.6015 | 0.6081   | 0.7327   |
| 0.003         | 14.4588 | 74000  | 0.2651          | 0.6401    | 0.6052 | 0.6126   | 0.7333   |
| 0.0038        | 14.6542 | 75000  | 0.2549          | 0.6932    | 0.5614 | 0.5727   | 0.7425   |
| 0.0059        | 14.8496 | 76000  | 0.2692          | 0.6336    | 0.5904 | 0.5981   | 0.7297   |
| 0.0022        | 15.0449 | 77000  | 0.2554          | 0.6621    | 0.5822 | 0.6015   | 0.7469   |
| 0.0036        | 15.2403 | 78000  | 0.2571          | 0.6655    | 0.5893 | 0.6065   | 0.7402   |
| 0.0028        | 15.4357 | 79000  | 0.2563          | 0.6760    | 0.5735 | 0.5879   | 0.7446   |
| 0.0035        | 15.6311 | 80000  | 0.2561          | 0.6728    | 0.5869 | 0.6045   | 0.7461   |
| 0.0017        | 15.8265 | 81000  | 0.2708          | 0.6633    | 0.5967 | 0.6072   | 0.7275   |
| 0.0017        | 16.0219 | 82000  | 0.2550          | 0.6593    | 0.5961 | 0.6122   | 0.7440   |
| 0.0017        | 16.2173 | 83000  | 0.2552          | 0.6683    | 0.5789 | 0.5885   | 0.7439   |
| 0.002         | 16.4127 | 84000  | 0.2660          | 0.6517    | 0.6071 | 0.6135   | 0.7348   |
| 0.0009        | 16.6081 | 85000  | 0.2513          | 0.6724    | 0.5848 | 0.6025   | 0.7503   |
| 0.0013        | 16.8034 | 86000  | 0.2600          | 0.6563    | 0.6050 | 0.6143   | 0.7407   |
| 0.0028        | 16.9988 | 87000  | 0.2616          | 0.6503    | 0.5987 | 0.6099   | 0.7388   |
| 0.0           | 17.1942 | 88000  | 0.2726          | 0.6566    | 0.6006 | 0.6033   | 0.7260   |
| 0.0013        | 17.3896 | 89000  | 0.2617          | 0.6578    | 0.5810 | 0.6004   | 0.7415   |
| 0.0001        | 17.5850 | 90000  | 0.2856          | 0.6078    | 0.6094 | 0.6041   | 0.7181   |
| 0.0012        | 17.7804 | 91000  | 0.2588          | 0.6469    | 0.5774 | 0.5831   | 0.7424   |
| 0.0013        | 17.9758 | 92000  | 0.2591          | 0.6660    | 0.6029 | 0.6149   | 0.7426   |
| 0.0014        | 18.1712 | 93000  | 0.2777          | 0.6420    | 0.6030 | 0.6084   | 0.7225   |
| 0.0019        | 18.3665 | 94000  | 0.2609          | 0.6631    | 0.5918 | 0.6042   | 0.7408   |
| 0.0025        | 18.5619 | 95000  | 0.2630          | 0.6484    | 0.5954 | 0.6057   | 0.7398   |
| 0.0019        | 18.7573 | 96000  | 0.2620          | 0.6486    | 0.5984 | 0.6092   | 0.7386   |
| 0.002         | 18.9527 | 97000  | 0.2575          | 0.6566    | 0.5864 | 0.5910   | 0.7448   |
| 0.0006        | 19.1481 | 98000  | 0.2590          | 0.6534    | 0.5976 | 0.6084   | 0.7432   |
| 0.0013        | 19.3435 | 99000  | 0.2558          | 0.6584    | 0.5905 | 0.6013   | 0.7463   |
| 0.0           | 19.5389 | 100000 | 0.2624          | 0.6542    | 0.5980 | 0.6083   | 0.7398   |
| 0.0006        | 19.7343 | 101000 | 0.2641          | 0.6449    | 0.6017 | 0.6089   | 0.7382   |
| 0.0           | 19.9297 | 102000 | 0.2626          | 0.6513    | 0.5985 | 0.6074   | 0.7401   |
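
The summary metrics at the top of the card correspond to the final step (102000); several earlier checkpoints reached a noticeably lower validation loss. A small sketch for picking the lowest-validation-loss row from a log like the one above (only a handful of rows are reproduced here):

```python
# (step, validation_loss) pairs copied from a few rows of the table above.
rows = [
    (1000, 0.2100),
    (4000, 0.1904),
    (5000, 0.1868),
    (10000, 0.1867),
    (102000, 0.2626),
]
# Select the checkpoint with the lowest validation loss.
best_step, best_loss = min(rows, key=lambda r: r[1])
# Among these rows: step 10000 (epoch ~1.95), validation loss 0.1867.
```

In the Trainer this selection is usually automated via `load_best_model_at_end=True` with `metric_for_best_model` set; whether that was enabled here is not stated in the card.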

Framework versions

  • Transformers 4.49.0
  • PyTorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.1
Model size: 279M params (tensor type F32, Safetensors format)

Model tree for whoisjones/gemma-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05-20250411_135828

Finetuned from microsoft/mdeberta-v3-base