ST1_modernbert-base_hazard-category_V1

This model is a fine-tuned version of answerdotai/ModernBERT-base on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7555
  • F1: 0.9462
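
For quick use, here is a minimal inference sketch, assuming the checkpoint is published under the repo id BenPhan/ST1_modernbert-base_hazard-category_V1 and exposes a standard sequence-classification head; the hazard-category label names come from the checkpoint's config and are not documented in this card:

```python
from transformers import pipeline

# Usage sketch: repo id taken from this model card's page; the label
# set is defined in the checkpoint's config, not documented here.
classifier = pipeline(
    "text-classification",
    model="BenPhan/ST1_modernbert-base_hazard-category_V1",
)

# Example input (hypothetical); replace with a report relevant to the
# hazard-category task this model was trained for.
print(classifier("Metal fragments were found in the packaged meat product."))
```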

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 36
  • eval_batch_size: 16
  • seed: 42
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 200
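
As a rough reconstruction, the settings above map onto a Hugging Face TrainingArguments configuration along these lines. This is a sketch, not the exact training script: the output directory and the per-epoch evaluation strategy are assumptions; only the listed values come from the card.

```python
from transformers import TrainingArguments

# Sketch of the reported hyperparameters as TrainingArguments.
training_args = TrainingArguments(
    output_dir="ST1_modernbert-base_hazard-category_V1",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=36,
    per_device_eval_batch_size=16,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=200,
    eval_strategy="epoch",  # assumed: the results table reports one eval per epoch
)
```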

Training results

Training Loss | Epoch | Step | Validation Loss | F1
0.83 1.0 128 0.2556 0.9226
0.2639 2.0 256 0.2905 0.9053
0.17 3.0 384 0.2935 0.9388
0.0967 4.0 512 0.3408 0.9291
0.0549 5.0 640 0.3093 0.9442
0.0523 6.0 768 0.3185 0.9470
0.0312 7.0 896 0.4171 0.9407
0.0165 8.0 1024 0.4558 0.9469
0.0116 9.0 1152 0.4671 0.9359
0.0077 10.0 1280 0.4365 0.9511
0.0071 11.0 1408 0.5180 0.9387
0.008 12.0 1536 0.5229 0.9383
0.001 13.0 1664 0.5507 0.9378
0.0021 14.0 1792 0.5437 0.9427
0.0009 15.0 1920 0.5607 0.9466
0.0001 16.0 2048 0.5666 0.9445
0.0006 17.0 2176 0.5684 0.9405
0.0008 18.0 2304 0.5755 0.9446
0.0004 19.0 2432 0.5756 0.9385
0.0004 20.0 2560 0.5775 0.9425
0.0007 21.0 2688 0.5859 0.9389
0.0007 22.0 2816 0.5852 0.9385
0.0007 23.0 2944 0.5931 0.9427
0.0003 24.0 3072 0.5948 0.9408
0.0007 25.0 3200 0.5976 0.9408
0.0006 26.0 3328 0.5907 0.9385
0.0 27.0 3456 0.5985 0.9445
0.0009 28.0 3584 0.6052 0.9409
0.0005 29.0 3712 0.5952 0.9464
0.0006 30.0 3840 0.6037 0.9445
0.0 31.0 3968 0.5994 0.9445
0.0006 32.0 4096 0.6061 0.9409
0.0005 33.0 4224 0.6046 0.9445
0.0006 34.0 4352 0.6007 0.9445
0.0 35.0 4480 0.6052 0.9445
0.0006 36.0 4608 0.6091 0.9425
0.0003 37.0 4736 0.6106 0.9445
0.0008 38.0 4864 0.6117 0.9445
0.0002 39.0 4992 0.6054 0.9464
0.0 40.0 5120 0.6080 0.9443
0.0004 41.0 5248 0.6117 0.9443
0.0011 42.0 5376 0.6152 0.9443
0.0004 43.0 5504 0.6164 0.9425
0.0007 44.0 5632 0.6188 0.9425
0.0005 45.0 5760 0.6104 0.9442
0.0 46.0 5888 0.6164 0.9446
0.0006 47.0 6016 0.6104 0.9439
0.0002 48.0 6144 0.6129 0.9443
0.0007 49.0 6272 0.6205 0.9424
0.0006 50.0 6400 0.6182 0.9425
0.0006 51.0 6528 0.6113 0.9439
0.0 52.0 6656 0.6212 0.9443
0.0005 53.0 6784 0.6186 0.9440
0.0004 54.0 6912 0.6163 0.9440
0.0 55.0 7040 0.6172 0.9440
0.0008 56.0 7168 0.6163 0.9440
0.0005 57.0 7296 0.6211 0.9440
0.0006 58.0 7424 0.6232 0.9422
0.0003 59.0 7552 0.6240 0.9440
0.0003 60.0 7680 0.6224 0.9440
0.0005 61.0 7808 0.6273 0.9419
0.0006 62.0 7936 0.6239 0.9423
0.0001 63.0 8064 0.6216 0.9419
0.0004 64.0 8192 0.6191 0.9420
0.0005 65.0 8320 0.6169 0.9420
0.0 66.0 8448 0.6201 0.9420
0.0005 67.0 8576 0.6218 0.9402
0.0004 68.0 8704 0.6195 0.9421
0.0004 69.0 8832 0.6246 0.9402
0.0002 70.0 8960 0.6269 0.9420
0.0003 71.0 9088 0.6268 0.9402
0.0005 72.0 9216 0.6254 0.9418
0.0 73.0 9344 0.6273 0.9402
0.0007 74.0 9472 0.6257 0.9437
0.0005 75.0 9600 0.6213 0.9399
0.0005 76.0 9728 0.6266 0.9418
0.0002 77.0 9856 0.6258 0.9418
0.0005 78.0 9984 0.6298 0.9418
0.0003 79.0 10112 0.6242 0.9439
0.0002 80.0 10240 0.6284 0.9418
0.0008 81.0 10368 0.6255 0.9439
0.0 82.0 10496 0.6312 0.9439
0.0005 83.0 10624 0.6312 0.9399
0.0002 84.0 10752 0.6279 0.9381
0.0005 85.0 10880 0.6295 0.9401
0.0005 86.0 11008 0.6231 0.9433
0.0005 87.0 11136 0.6302 0.9433
0.0002 88.0 11264 0.6281 0.9433
0.0003 89.0 11392 0.6326 0.9433
0.0002 90.0 11520 0.6347 0.9418
0.0005 91.0 11648 0.6324 0.9418
0.0007 92.0 11776 0.6362 0.9418
0.0005 93.0 11904 0.6351 0.9433
0.0004 94.0 12032 0.6372 0.9433
0.0002 95.0 12160 0.6347 0.9433
0.0005 96.0 12288 0.6378 0.9418
0.0005 97.0 12416 0.6384 0.9418
0.0005 98.0 12544 0.6449 0.9418
0.0 99.0 12672 0.6418 0.9433
0.0005 100.0 12800 0.6540 0.9454
0.0005 101.0 12928 0.6413 0.9466
0.0111 102.0 13056 0.5095 0.9187
0.0623 103.0 13184 0.5184 0.9350
0.0167 104.0 13312 0.5990 0.9222
0.0052 105.0 13440 0.6861 0.9409
0.0066 106.0 13568 0.6613 0.9455
0.0003 107.0 13696 0.6736 0.9462
0.0002 108.0 13824 0.6888 0.9446
0.0005 109.0 13952 0.6931 0.9462
0.0004 110.0 14080 0.6953 0.9462
0.0002 111.0 14208 0.6987 0.9462
0.0002 112.0 14336 0.7009 0.9462
0.0006 113.0 14464 0.7038 0.9462
0.0 114.0 14592 0.7079 0.9462
0.0004 115.0 14720 0.7073 0.9462
0.0 116.0 14848 0.7094 0.9462
0.0008 117.0 14976 0.7091 0.9462
0.0004 118.0 15104 0.7108 0.9462
0.0004 119.0 15232 0.7111 0.9462
0.0002 120.0 15360 0.7138 0.9462
0.0004 121.0 15488 0.7149 0.9462
0.0004 122.0 15616 0.7144 0.9462
0.0004 123.0 15744 0.7164 0.9462
0.0004 124.0 15872 0.7178 0.9462
0.0004 125.0 16000 0.7178 0.9462
0.0002 126.0 16128 0.7191 0.9462
0.0007 127.0 16256 0.7189 0.9462
0.0002 128.0 16384 0.7203 0.9462
0.0004 129.0 16512 0.7215 0.9462
0.0 130.0 16640 0.7221 0.9462
0.0009 131.0 16768 0.7232 0.9462
0.0 132.0 16896 0.7236 0.9462
0.0002 133.0 17024 0.7242 0.9462
0.0002 134.0 17152 0.7253 0.9462
0.0004 135.0 17280 0.7247 0.9462
0.0002 136.0 17408 0.7248 0.9462
0.0004 137.0 17536 0.7265 0.9462
0.0002 138.0 17664 0.7264 0.9462
0.0003 139.0 17792 0.7306 0.9462
0.0004 140.0 17920 0.7302 0.9462
0.0002 141.0 18048 0.7304 0.9462
0.0004 142.0 18176 0.7307 0.9462
0.0002 143.0 18304 0.7325 0.9462
0.0002 144.0 18432 0.7324 0.9462
0.0007 145.0 18560 0.7321 0.9462
0.0 146.0 18688 0.7324 0.9462
0.0004 147.0 18816 0.7355 0.9462
0.0004 148.0 18944 0.7348 0.9462
0.0002 149.0 19072 0.7355 0.9462
0.0004 150.0 19200 0.7357 0.9462
0.0004 151.0 19328 0.7371 0.9462
0.0002 152.0 19456 0.7374 0.9462
0.0004 153.0 19584 0.7384 0.9462
0.0002 154.0 19712 0.7387 0.9462
0.0004 155.0 19840 0.7390 0.9462
0.0004 156.0 19968 0.7396 0.9462
0.0002 157.0 20096 0.7400 0.9462
0.0002 158.0 20224 0.7420 0.9462
0.0004 159.0 20352 0.7391 0.9462
0.0006 160.0 20480 0.7420 0.9462
0.0004 161.0 20608 0.7428 0.9462
0.0004 162.0 20736 0.7436 0.9462
0.0002 163.0 20864 0.7442 0.9462
0.0002 164.0 20992 0.7444 0.9462
0.0004 165.0 21120 0.7451 0.9446
0.0002 166.0 21248 0.7450 0.9462
0.0004 167.0 21376 0.7452 0.9462
0.0002 168.0 21504 0.7478 0.9462
0.0004 169.0 21632 0.7467 0.9462
0.0002 170.0 21760 0.7467 0.9480
0.0006 171.0 21888 0.7491 0.9462
0.0004 172.0 22016 0.7489 0.9462
0.0002 173.0 22144 0.7491 0.9462
0.0004 174.0 22272 0.7503 0.9462
0.0002 175.0 22400 0.7513 0.9462
0.0004 176.0 22528 0.7496 0.9462
0.0 177.0 22656 0.7511 0.9462
0.0006 178.0 22784 0.7508 0.9480
0.0002 179.0 22912 0.7519 0.9462
0.0004 180.0 23040 0.7535 0.9462
0.0002 181.0 23168 0.7534 0.9462
0.0002 182.0 23296 0.7530 0.9462
0.0004 183.0 23424 0.7522 0.9480
0.0002 184.0 23552 0.7524 0.9462
0.0002 185.0 23680 0.7529 0.9462
0.0 186.0 23808 0.7536 0.9462
0.0002 187.0 23936 0.7550 0.9480
0.0006 188.0 24064 0.7549 0.9462
0.0 189.0 24192 0.7532 0.9480
0.0004 190.0 24320 0.7556 0.9462
0.0004 191.0 24448 0.7546 0.9462
0.0002 192.0 24576 0.7553 0.9462
0.0004 193.0 24704 0.7571 0.9462
0.0002 194.0 24832 0.7551 0.9480
0.0006 195.0 24960 0.7559 0.9480
0.0002 196.0 25088 0.7552 0.9462
0.0002 197.0 25216 0.7560 0.9480
0.0002 198.0 25344 0.7562 0.9480
0.0004 199.0 25472 0.7552 0.9480
0.0002 200.0 25600 0.7555 0.9462
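
The card does not state how the reported F1 was aggregated across classes. A plausible compute_metrics sketch for a multi-class setup, assuming weighted averaging via scikit-learn, would look like this:

```python
import numpy as np
from sklearn.metrics import f1_score

# Assumed metric function; "weighted" averaging is a guess, since the
# card does not say how the reported F1 was aggregated.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, predictions, average="weighted")}
```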

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.0