# gemma-fineweb-edu-scorer-mdeberta-multilabel-lr5e-05-20250411_135828
This model is a fine-tuned version of microsoft/mdeberta-v3-base on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 0.2626
- Precision: 0.6513
- Recall: 0.5985
- F1 Macro: 0.6074
- Accuracy: 0.7401
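As the model name indicates, this is a multilabel classifier: at inference time each logit is passed through a sigmoid independently and thresholded, rather than softmaxed across labels. A minimal sketch of that scoring step in plain Python (the logit values, label count, and 0.5 threshold below are illustrative assumptions, not taken from the model):

```python
import math

def sigmoid(x: float) -> float:
    """Standard logistic function, applied per label in a multilabel head."""
    return 1.0 / (1.0 + math.exp(-x))

def score_multilabel(logits, threshold=0.5):
    """Convert raw logits to independent per-label probabilities and binary predictions."""
    probs = [sigmoid(z) for z in logits]
    preds = [int(p >= threshold) for p in probs]
    return probs, preds

# Illustrative logits for a 4-label head.
probs, preds = score_multilabel([2.0, -1.0, 0.1, -3.0])
print(preds)  # → [1, 0, 1, 0]
```

Note that, unlike single-label softmax classification, several labels (or none) can fire for the same input.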
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 32
- seed: 0
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 20
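Under the `transformers` Trainer API, these hyperparameters would correspond roughly to the following `TrainingArguments`. This is a hedged sketch: the `output_dir` value is a placeholder and the original training script is not available.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters; output_dir is a placeholder.
training_args = TrainingArguments(
    output_dir="gemma-fineweb-edu-scorer",  # placeholder, not from the original run
    learning_rate=5e-05,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=0,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="linear",
    num_train_epochs=20,
)
```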
### Training results
| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|---|---|---|---|---|---|---|---|
No log | 0 | 0 | 5.3215 | 0.0433 | 0.25 | 0.0737 | 0.1730 |
0.2231 | 0.1954 | 1000 | 0.2100 | 0.5533 | 0.5211 | 0.5332 | 0.7474 |
0.2297 | 0.3908 | 2000 | 0.2306 | 0.5656 | 0.4913 | 0.4884 | 0.7059 |
0.2044 | 0.5862 | 3000 | 0.2403 | 0.5660 | 0.4995 | 0.4964 | 0.7056 |
0.2277 | 0.7816 | 4000 | 0.1904 | 0.7714 | 0.5493 | 0.5609 | 0.7602 |
0.2007 | 0.9769 | 5000 | 0.1868 | 0.5619 | 0.5484 | 0.5500 | 0.7554 |
0.1698 | 1.1723 | 6000 | 0.2103 | 0.6577 | 0.6027 | 0.5986 | 0.7503 |
0.172 | 1.3677 | 7000 | 0.1946 | 0.5552 | 0.5522 | 0.5524 | 0.7581 |
0.1664 | 1.5631 | 8000 | 0.2038 | 0.5294 | 0.5785 | 0.5502 | 0.7417 |
0.1779 | 1.7585 | 9000 | 0.1916 | 0.6656 | 0.5889 | 0.5988 | 0.7583 |
0.1723 | 1.9539 | 10000 | 0.1867 | 0.7232 | 0.5817 | 0.5931 | 0.7641 |
0.1016 | 2.1493 | 11000 | 0.2077 | 0.7968 | 0.5421 | 0.5452 | 0.7499 |
0.1015 | 2.3447 | 12000 | 0.2390 | 0.6151 | 0.6351 | 0.6241 | 0.7350 |
0.1058 | 2.5401 | 13000 | 0.2189 | 0.6452 | 0.5951 | 0.6120 | 0.7461 |
0.109 | 2.7354 | 14000 | 0.2606 | 0.6620 | 0.5765 | 0.5904 | 0.7174 |
0.1054 | 2.9308 | 15000 | 0.2179 | 0.6465 | 0.5899 | 0.6009 | 0.7384 |
0.0624 | 3.1262 | 16000 | 0.2361 | 0.6394 | 0.5840 | 0.5927 | 0.7358 |
0.0615 | 3.3216 | 17000 | 0.2616 | 0.6494 | 0.5637 | 0.5829 | 0.7109 |
0.0768 | 3.5170 | 18000 | 0.2375 | 0.6464 | 0.5662 | 0.5911 | 0.7385 |
0.0631 | 3.7124 | 19000 | 0.2551 | 0.6190 | 0.6100 | 0.6104 | 0.7263 |
0.0691 | 3.9078 | 20000 | 0.2475 | 0.6438 | 0.5896 | 0.6018 | 0.7248 |
0.0462 | 4.1032 | 21000 | 0.2942 | 0.6375 | 0.5898 | 0.5901 | 0.6980 |
0.0468 | 4.2986 | 22000 | 0.2548 | 0.6522 | 0.5877 | 0.5971 | 0.7290 |
0.0518 | 4.4939 | 23000 | 0.2544 | 0.6261 | 0.5922 | 0.5959 | 0.7318 |
0.043 | 4.6893 | 24000 | 0.2512 | 0.6225 | 0.5912 | 0.5981 | 0.7298 |
0.0556 | 4.8847 | 25000 | 0.2446 | 0.6125 | 0.5965 | 0.6029 | 0.7339 |
0.0322 | 5.0801 | 26000 | 0.2861 | 0.6231 | 0.6128 | 0.6122 | 0.7225 |
0.0446 | 5.2755 | 27000 | 0.2399 | 0.6401 | 0.5812 | 0.5989 | 0.7397 |
0.0391 | 5.4709 | 28000 | 0.2744 | 0.6371 | 0.5929 | 0.6028 | 0.7204 |
0.0387 | 5.6663 | 29000 | 0.2498 | 0.6340 | 0.5671 | 0.5876 | 0.7370 |
0.0368 | 5.8617 | 30000 | 0.2503 | 0.6565 | 0.5890 | 0.5983 | 0.7349 |
0.0244 | 6.0571 | 31000 | 0.3083 | 0.5972 | 0.5978 | 0.5869 | 0.6873 |
0.024 | 6.2524 | 32000 | 0.2607 | 0.6283 | 0.6002 | 0.6074 | 0.7217 |
0.0271 | 6.4478 | 33000 | 0.2489 | 0.6461 | 0.5800 | 0.5922 | 0.7371 |
0.0235 | 6.6432 | 34000 | 0.2582 | 0.6350 | 0.5961 | 0.6047 | 0.7221 |
0.0224 | 6.8386 | 35000 | 0.2660 | 0.6211 | 0.5978 | 0.6048 | 0.7238 |
0.0156 | 7.0340 | 36000 | 0.2567 | 0.6686 | 0.5831 | 0.5865 | 0.7394 |
0.0186 | 7.2294 | 37000 | 0.2675 | 0.6115 | 0.6055 | 0.6041 | 0.7247 |
0.0202 | 7.4248 | 38000 | 0.2697 | 0.6234 | 0.5890 | 0.5975 | 0.7188 |
0.0192 | 7.6202 | 39000 | 0.2708 | 0.6097 | 0.6224 | 0.6137 | 0.7242 |
0.0192 | 7.8156 | 40000 | 0.2645 | 0.6147 | 0.6096 | 0.6098 | 0.7281 |
0.0206 | 8.0109 | 41000 | 0.2659 | 0.6197 | 0.6013 | 0.5995 | 0.7254 |
0.0176 | 8.2063 | 42000 | 0.2650 | 0.6327 | 0.5964 | 0.6004 | 0.7191 |
0.0163 | 8.4017 | 43000 | 0.2557 | 0.6793 | 0.5860 | 0.5959 | 0.7347 |
0.0211 | 8.5971 | 44000 | 0.2563 | 0.6486 | 0.6054 | 0.6075 | 0.7357 |
0.0135 | 8.7925 | 45000 | 0.2618 | 0.6867 | 0.5745 | 0.5789 | 0.7397 |
0.0169 | 8.9879 | 46000 | 0.2635 | 0.6209 | 0.5991 | 0.6041 | 0.7338 |
0.0114 | 9.1833 | 47000 | 0.2571 | 0.6272 | 0.5987 | 0.6063 | 0.7363 |
0.0122 | 9.3787 | 48000 | 0.2567 | 0.6300 | 0.5884 | 0.5973 | 0.7359 |
0.0136 | 9.5741 | 49000 | 0.2595 | 0.6874 | 0.5439 | 0.5595 | 0.7436 |
0.0124 | 9.7694 | 50000 | 0.2607 | 0.6353 | 0.5934 | 0.5980 | 0.7307 |
0.0124 | 9.9648 | 51000 | 0.2485 | 0.6560 | 0.5905 | 0.6003 | 0.7446 |
0.0132 | 10.1602 | 52000 | 0.2623 | 0.6437 | 0.5859 | 0.5871 | 0.7294 |
0.0134 | 10.3556 | 53000 | 0.2516 | 0.6420 | 0.5833 | 0.5991 | 0.7464 |
0.0151 | 10.5510 | 54000 | 0.2632 | 0.6358 | 0.5842 | 0.5926 | 0.7320 |
0.0074 | 10.7464 | 55000 | 0.2582 | 0.6711 | 0.5888 | 0.5987 | 0.7383 |
0.0078 | 10.9418 | 56000 | 0.2672 | 0.6205 | 0.5981 | 0.6006 | 0.7253 |
0.0093 | 11.1372 | 57000 | 0.2681 | 0.6323 | 0.5827 | 0.5884 | 0.7284 |
0.009 | 11.3326 | 58000 | 0.2699 | 0.6433 | 0.5984 | 0.5996 | 0.7220 |
0.0077 | 11.5279 | 59000 | 0.2542 | 0.6725 | 0.5778 | 0.5863 | 0.7420 |
0.0098 | 11.7233 | 60000 | 0.2520 | 0.6838 | 0.5810 | 0.5876 | 0.7468 |
0.0085 | 11.9187 | 61000 | 0.2738 | 0.6212 | 0.6099 | 0.6090 | 0.7157 |
0.008 | 12.1141 | 62000 | 0.2570 | 0.6576 | 0.5832 | 0.5956 | 0.7366 |
0.0061 | 12.3095 | 63000 | 0.2609 | 0.6466 | 0.5962 | 0.6067 | 0.7386 |
0.0077 | 12.5049 | 64000 | 0.2675 | 0.6180 | 0.5976 | 0.5956 | 0.7317 |
0.0062 | 12.7003 | 65000 | 0.2627 | 0.6233 | 0.6045 | 0.6066 | 0.7348 |
0.0041 | 12.8957 | 66000 | 0.2620 | 0.6568 | 0.5705 | 0.5855 | 0.7403 |
0.0016 | 13.0911 | 67000 | 0.2597 | 0.6470 | 0.5811 | 0.5912 | 0.7369 |
0.0085 | 13.2864 | 68000 | 0.2531 | 0.6584 | 0.5801 | 0.5928 | 0.7442 |
0.0147 | 13.4818 | 69000 | 0.2659 | 0.6274 | 0.6133 | 0.6131 | 0.7248 |
0.0029 | 13.6772 | 70000 | 0.2727 | 0.6073 | 0.6093 | 0.6024 | 0.7230 |
0.0034 | 13.8726 | 71000 | 0.2622 | 0.6518 | 0.5709 | 0.5737 | 0.7354 |
0.001 | 14.0680 | 72000 | 0.2635 | 0.6441 | 0.5877 | 0.6050 | 0.7388 |
0.0033 | 14.2634 | 73000 | 0.2684 | 0.6366 | 0.6015 | 0.6081 | 0.7327 |
0.003 | 14.4588 | 74000 | 0.2651 | 0.6401 | 0.6052 | 0.6126 | 0.7333 |
0.0038 | 14.6542 | 75000 | 0.2549 | 0.6932 | 0.5614 | 0.5727 | 0.7425 |
0.0059 | 14.8496 | 76000 | 0.2692 | 0.6336 | 0.5904 | 0.5981 | 0.7297 |
0.0022 | 15.0449 | 77000 | 0.2554 | 0.6621 | 0.5822 | 0.6015 | 0.7469 |
0.0036 | 15.2403 | 78000 | 0.2571 | 0.6655 | 0.5893 | 0.6065 | 0.7402 |
0.0028 | 15.4357 | 79000 | 0.2563 | 0.6760 | 0.5735 | 0.5879 | 0.7446 |
0.0035 | 15.6311 | 80000 | 0.2561 | 0.6728 | 0.5869 | 0.6045 | 0.7461 |
0.0017 | 15.8265 | 81000 | 0.2708 | 0.6633 | 0.5967 | 0.6072 | 0.7275 |
0.0017 | 16.0219 | 82000 | 0.2550 | 0.6593 | 0.5961 | 0.6122 | 0.7440 |
0.0017 | 16.2173 | 83000 | 0.2552 | 0.6683 | 0.5789 | 0.5885 | 0.7439 |
0.002 | 16.4127 | 84000 | 0.2660 | 0.6517 | 0.6071 | 0.6135 | 0.7348 |
0.0009 | 16.6081 | 85000 | 0.2513 | 0.6724 | 0.5848 | 0.6025 | 0.7503 |
0.0013 | 16.8034 | 86000 | 0.2600 | 0.6563 | 0.6050 | 0.6143 | 0.7407 |
0.0028 | 16.9988 | 87000 | 0.2616 | 0.6503 | 0.5987 | 0.6099 | 0.7388 |
0.0 | 17.1942 | 88000 | 0.2726 | 0.6566 | 0.6006 | 0.6033 | 0.7260 |
0.0013 | 17.3896 | 89000 | 0.2617 | 0.6578 | 0.5810 | 0.6004 | 0.7415 |
0.0001 | 17.5850 | 90000 | 0.2856 | 0.6078 | 0.6094 | 0.6041 | 0.7181 |
0.0012 | 17.7804 | 91000 | 0.2588 | 0.6469 | 0.5774 | 0.5831 | 0.7424 |
0.0013 | 17.9758 | 92000 | 0.2591 | 0.6660 | 0.6029 | 0.6149 | 0.7426 |
0.0014 | 18.1712 | 93000 | 0.2777 | 0.6420 | 0.6030 | 0.6084 | 0.7225 |
0.0019 | 18.3665 | 94000 | 0.2609 | 0.6631 | 0.5918 | 0.6042 | 0.7408 |
0.0025 | 18.5619 | 95000 | 0.2630 | 0.6484 | 0.5954 | 0.6057 | 0.7398 |
0.0019 | 18.7573 | 96000 | 0.2620 | 0.6486 | 0.5984 | 0.6092 | 0.7386 |
0.002 | 18.9527 | 97000 | 0.2575 | 0.6566 | 0.5864 | 0.5910 | 0.7448 |
0.0006 | 19.1481 | 98000 | 0.2590 | 0.6534 | 0.5976 | 0.6084 | 0.7432 |
0.0013 | 19.3435 | 99000 | 0.2558 | 0.6584 | 0.5905 | 0.6013 | 0.7463 |
0.0 | 19.5389 | 100000 | 0.2624 | 0.6542 | 0.5980 | 0.6083 | 0.7398 |
0.0006 | 19.7343 | 101000 | 0.2641 | 0.6449 | 0.6017 | 0.6089 | 0.7382 |
0.0 | 19.9297 | 102000 | 0.2626 | 0.6513 | 0.5985 | 0.6074 | 0.7401 |
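The Precision, Recall, and F1 Macro columns above are macro averages: each metric is computed per label and the per-label scores are averaged with equal weight, so rare labels count as much as frequent ones. A small pure-Python sketch of macro F1 (the toy label vectors below are illustrative, not from the evaluation set):

```python
def macro_f1(y_true, y_pred):
    """Macro F1: per-label F1 scores averaged with equal weight per label.

    y_true, y_pred: lists of equal-length binary vectors, one vector per example.
    """
    n_labels = len(y_true[0])
    f1s = []
    for j in range(n_labels):
        tp = sum(t[j] and p[j] for t, p in zip(y_true, y_pred))
        fp = sum((not t[j]) and p[j] for t, p in zip(y_true, y_pred))
        fn = sum(t[j] and (not p[j]) for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if (precision + recall) else 0.0)
    return sum(f1s) / n_labels

# Toy example with 3 samples and 2 labels.
y_true = [[1, 0], [1, 1], [0, 1]]
y_pred = [[1, 0], [0, 1], [0, 1]]
print(macro_f1(y_true, y_pred))  # → 0.8333... (label 0: F1=2/3, label 1: F1=1)
```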
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.3.2
- Tokenizers 0.21.1