gemma-fineweb-edu-scorer-mdeberta-binary-lr5e-05-20250411_140230

This model is a fine-tuned version of microsoft/mdeberta-v3-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1614
  • Precision: 1.0
  • Recall: 1.0
  • F1 Macro: 1.0
  • Accuracy: 1.0
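
For convenience, here is a minimal inference sketch using the transformers text-classification pipeline. The example input and the printed label/score are illustrative, not taken from this card; the actual label names come from the model's config.

```python
# Minimal usage sketch: load this checkpoint with the standard
# text-classification pipeline. Label names and scores shown in the
# comment below are assumptions, not documented outputs.
from transformers import pipeline

scorer = pipeline(
    "text-classification",
    model="whoisjones/gemma-fineweb-edu-scorer-mdeberta-binary-lr5e-05-20250411_140230",
)

print(scorer("Photosynthesis converts light energy into chemical energy."))
# e.g. [{'label': 'LABEL_1', 'score': 0.97}]  (illustrative output)
```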

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 0
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 20
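
The sketch below wires these values into transformers' TrainingArguments/Trainer, with evaluation every 1000 steps to match the results table below. It is a reproduction sketch under assumptions, not the author's training script: the card does not document the training corpus, so the tiny dataset, label values, and output path are placeholders.

```python
# Reproduction sketch based only on the hyperparameters listed above.
# The training data is undocumented; the two toy examples below are
# placeholders -- swap in the real corpus and labels.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

base = "microsoft/mdeberta-v3-base"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

# Placeholder data standing in for the undocumented training set.
data = Dataset.from_dict(
    {"text": ["An educational passage.", "A non-educational passage."],
     "label": [1, 0]}
).map(lambda ex: tokenizer(ex["text"], truncation=True))

args = TrainingArguments(
    output_dir="mdeberta-binary-scorer",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=32,
    seed=0,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    eval_strategy="steps",  # the table below logs eval every 1000 steps
    eval_steps=1000,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=data,
    eval_dataset=data,
    processing_class=tokenizer,  # tokenizer argument name as of Transformers 4.49
)
trainer.train()
```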

Training results

| Training Loss | Epoch | Step | Validation Loss | Precision | Recall | F1 Macro | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:---------:|:------:|:--------:|:--------:|
| No log | 0 | 0 | 0.2170 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1238 | 0.1954 | 1000 | 0.1458 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1143 | 0.3908 | 2000 | 0.1076 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1208 | 0.5862 | 3000 | 0.1133 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1031 | 0.7816 | 4000 | 0.1069 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1045 | 0.9769 | 5000 | 0.1047 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0974 | 1.1723 | 6000 | 0.1087 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1337 | 1.3677 | 7000 | 0.1209 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0864 | 1.5631 | 8000 | 0.1151 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1049 | 1.7585 | 9000 | 0.1203 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0915 | 1.9539 | 10000 | 0.1255 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.079 | 2.1493 | 11000 | 0.1369 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0709 | 2.3447 | 12000 | 0.1167 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0823 | 2.5401 | 13000 | 0.1257 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0725 | 2.7354 | 14000 | 0.1114 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.066 | 2.9308 | 15000 | 0.1232 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0354 | 3.1262 | 16000 | 0.1321 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0486 | 3.3216 | 17000 | 0.1245 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0466 | 3.5170 | 18000 | 0.1329 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0523 | 3.7124 | 19000 | 0.1350 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0614 | 3.9078 | 20000 | 0.1373 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0349 | 4.1032 | 21000 | 0.1461 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0315 | 4.2986 | 22000 | 0.1406 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0337 | 4.4939 | 23000 | 0.1355 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0345 | 4.6893 | 24000 | 0.1300 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0324 | 4.8847 | 25000 | 0.1324 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0247 | 5.0801 | 26000 | 0.1347 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0306 | 5.2755 | 27000 | 0.1474 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0267 | 5.4709 | 28000 | 0.1394 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0204 | 5.6663 | 29000 | 0.1487 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0266 | 5.8617 | 30000 | 0.1454 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0188 | 6.0571 | 31000 | 0.1434 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0192 | 6.2524 | 32000 | 0.1458 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0246 | 6.4478 | 33000 | 0.1455 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0219 | 6.6432 | 34000 | 0.1440 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0231 | 6.8386 | 35000 | 0.1561 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0185 | 7.0340 | 36000 | 0.1504 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0163 | 7.2294 | 37000 | 0.1516 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0186 | 7.4248 | 38000 | 0.1451 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.025 | 7.6202 | 39000 | 0.1423 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0217 | 7.8156 | 40000 | 0.1453 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0226 | 8.0109 | 41000 | 0.1584 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0136 | 8.2063 | 42000 | 0.1589 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0186 | 8.4017 | 43000 | 0.1518 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0215 | 8.5971 | 44000 | 0.1459 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0182 | 8.7925 | 45000 | 0.1437 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0236 | 8.9879 | 46000 | 0.1409 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0204 | 9.1833 | 47000 | 0.1514 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0198 | 9.3787 | 48000 | 0.1617 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0158 | 9.5741 | 49000 | 0.1366 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0256 | 9.7694 | 50000 | 0.1450 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0186 | 9.9648 | 51000 | 0.1525 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0238 | 10.1602 | 52000 | 0.1658 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0189 | 10.3556 | 53000 | 0.1442 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0184 | 10.5510 | 54000 | 0.1495 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0196 | 10.7464 | 55000 | 0.1428 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0224 | 10.9418 | 56000 | 0.1606 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.018 | 11.1372 | 57000 | 0.1436 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1813 | 11.3326 | 58000 | 0.1829 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1888 | 11.5279 | 59000 | 0.1832 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.1826 | 11.7233 | 60000 | 0.1828 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0227 | 11.9187 | 61000 | 0.1447 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0162 | 12.1141 | 62000 | 0.1491 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0212 | 12.3095 | 63000 | 0.1386 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0124 | 12.5049 | 64000 | 0.1507 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0162 | 12.7003 | 65000 | 0.1425 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0122 | 12.8957 | 66000 | 0.1417 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0105 | 13.0911 | 67000 | 0.1414 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0114 | 13.2864 | 68000 | 0.1537 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0083 | 13.4818 | 69000 | 0.1548 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0084 | 13.6772 | 70000 | 0.1439 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0113 | 13.8726 | 71000 | 0.1504 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.013 | 14.0680 | 72000 | 0.1480 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0066 | 14.2634 | 73000 | 0.1544 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0119 | 14.4588 | 74000 | 0.1509 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0078 | 14.6542 | 75000 | 0.1546 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0162 | 14.8496 | 76000 | 0.1513 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0094 | 15.0449 | 77000 | 0.1571 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0097 | 15.2403 | 78000 | 0.1646 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0132 | 15.4357 | 79000 | 0.1505 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0127 | 15.6311 | 80000 | 0.1539 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0086 | 15.8265 | 81000 | 0.1572 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0067 | 16.0219 | 82000 | 0.1583 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.007 | 16.2173 | 83000 | 0.1531 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0117 | 16.4127 | 84000 | 0.1485 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0156 | 16.6081 | 85000 | 0.1495 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0089 | 16.8034 | 86000 | 0.1570 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0075 | 16.9988 | 87000 | 0.1540 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0068 | 17.1942 | 88000 | 0.1612 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0074 | 17.3896 | 89000 | 0.1596 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0117 | 17.5850 | 90000 | 0.1617 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0116 | 17.7804 | 91000 | 0.1689 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0064 | 17.9758 | 92000 | 0.1602 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0079 | 18.1712 | 93000 | 0.1647 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0051 | 18.3665 | 94000 | 0.1534 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0069 | 18.5619 | 95000 | 0.1570 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0062 | 18.7573 | 96000 | 0.1533 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0048 | 18.9527 | 97000 | 0.1566 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0142 | 19.1481 | 98000 | 0.1532 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0147 | 19.3435 | 99000 | 0.1501 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0085 | 19.5389 | 100000 | 0.1535 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.005 | 19.7343 | 101000 | 0.1599 | 1.0 | 1.0 | 1.0 | 1.0 |
| 0.0066 | 19.9297 | 102000 | 0.1614 | 1.0 | 1.0 | 1.0 | 1.0 |

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.1