clapAI/ModernBERT-base-VSMEC-ep50

This model is a fine-tuned version of answerdotai/ModernBERT-base on the VSMEC dataset (the dataset field of the auto-generated card was left as "None"; the name is taken from the model ID). It achieves the following results on the evaluation set (all metrics are reported as percentages):

  • Loss: 3.0842
  • Micro F1: 51.1662
  • Micro Precision: 51.1662
  • Micro Recall: 51.1662
  • Macro F1: 43.7557
  • Macro Precision: 45.2358
  • Macro Recall: 43.1885
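
The three identical micro scores above are no coincidence: in single-label multiclass evaluation, micro-averaged precision, recall, and F1 all reduce to plain accuracy, while the macro scores average per-class F1 and therefore penalize mistakes on rare classes more heavily. A minimal pure-Python sketch of the two averaging schemes (toy labels, not VSMEC data):

```python
def micro_f1(y_true, y_pred):
    # Micro-averaging pools TP/FP/FN over all classes. In single-label
    # classification every wrong prediction is one FP (for the predicted
    # class) and one FN (for the true class), so micro precision ==
    # micro recall == micro F1 == accuracy.
    tp = sum(t == p for t, p in zip(y_true, y_pred))
    fp = fn = len(y_true) - tp
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def macro_f1(y_true, y_pred):
    # Macro-averaging computes F1 per class and takes an unweighted mean,
    # so rare classes count as much as frequent ones.
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
print(micro_f1(y_true, y_pred))  # equals accuracy: 4 correct out of 6
```

Multiplying these fractions by 100 gives scores on the same percentage scale as the numbers reported in this card.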

Model description

More information needed

Intended uses & limitations

More information needed
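
Although this section is left empty, the model ID suggests single-label emotion classification in the style of VSMEC. The sketch below covers only the post-processing step from classifier logits to a label; the seven label names in `id2label` are an assumption for illustration and may not match this checkpoint's config:

```python
import math

# Assumed 7-class emotion mapping (hypothetical; read the real one from
# the checkpoint's config.json id2label field instead).
id2label = {0: "Anger", 1: "Disgust", 2: "Enjoyment", 3: "Fear",
            4: "Other", 5: "Sadness", 6: "Surprise"}

def predict_label(logits):
    # Numerically stable softmax for a readable confidence score,
    # argmax for the predicted class.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(logits)), key=probs.__getitem__)
    return id2label[best], probs[best]

label, prob = predict_label([0.1, -1.2, 3.4, 0.0, 0.5, -0.3, 1.1])
print(label, round(prob, 3))
```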

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • distributed_type: multi-GPU
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.01
  • num_epochs: 50.0
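
The total_train_batch_size above follows from the per-device batch size times the gradient-accumulation steps, and the cosine schedule with warmup_ratio 0.01 can be sketched in pure Python (the step counts below are hypothetical):

```python
import math

train_batch_size = 64              # per-device batch size, as listed above
gradient_accumulation_steps = 4
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)      # 256, matching the hyperparameter list

def cosine_lr(step, total_steps, base_lr=5e-5, warmup_ratio=0.01):
    # Linear warmup over the first warmup_ratio of the run, then cosine
    # decay to zero, mirroring lr_scheduler_type=cosine above.
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))
```

In a multi-GPU run the effective total is additionally multiplied by the number of devices; the card's value of 256 corresponds to 64 × 4.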

Training results

| Training Loss | Epoch | Step | Validation Loss | Micro F1 | Micro Precision | Micro Recall | Macro F1 | Macro Precision | Macro Recall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| 7.1297 | 1.0 | 22 | 1.7270 | 29.1545 | 29.1545 | 29.1545 | 16.9387 | 15.9469 | 18.7182 |
| 6.7 | 2.0 | 44 | 1.6690 | 31.7784 | 31.7784 | 31.7784 | 16.4776 | 28.9622 | 19.6919 |
| 6.4734 | 3.0 | 66 | 1.5725 | 39.3586 | 39.3586 | 39.3586 | 23.8242 | 23.6632 | 27.7820 |
| 5.625 | 4.0 | 88 | 1.4990 | 42.2741 | 42.2741 | 42.2741 | 27.8517 | 34.6415 | 28.7210 |
| 4.7406 | 5.0 | 110 | 1.4739 | 45.7726 | 45.7726 | 45.7726 | 38.7619 | 41.6066 | 38.3720 |
| 4.2664 | 6.0 | 132 | 1.4690 | 44.4606 | 44.4606 | 44.4606 | 34.9572 | 38.4359 | 35.7133 |
| 2.968 | 7.0 | 154 | 1.5695 | 46.3557 | 46.3557 | 46.3557 | 38.7596 | 44.9462 | 38.5572 |
| 1.8371 | 8.0 | 176 | 1.6301 | 48.8338 | 48.8338 | 48.8338 | 40.1485 | 44.4797 | 39.6512 |
| 0.998 | 9.0 | 198 | 1.7963 | 47.2303 | 47.2303 | 47.2303 | 39.2695 | 44.6568 | 38.6126 |
| 0.3793 | 10.0 | 220 | 1.8679 | 50.1458 | 50.1458 | 50.1458 | 40.9779 | 44.9562 | 39.9607 |
| 0.1684 | 11.0 | 242 | 2.1732 | 49.2711 | 49.2711 | 49.2711 | 41.2788 | 45.8904 | 39.6737 |
| 0.107 | 12.0 | 264 | 2.3095 | 46.6472 | 46.6472 | 46.6472 | 39.2907 | 44.2685 | 38.2952 |
| 0.0577 | 13.0 | 286 | 2.3684 | 48.3965 | 48.3965 | 48.3965 | 40.5603 | 45.0383 | 39.7106 |
| 0.1392 | 14.0 | 308 | 2.6708 | 47.5219 | 47.5219 | 47.5219 | 39.7073 | 45.9608 | 38.7368 |
| 0.0391 | 15.0 | 330 | 2.5045 | 48.6880 | 48.6880 | 48.6880 | 41.3458 | 44.3799 | 40.3860 |
| 0.0709 | 16.0 | 352 | 2.7772 | 49.8542 | 49.8542 | 49.8542 | 41.1807 | 49.0005 | 39.2832 |
| 0.0353 | 17.0 | 374 | 2.6289 | 46.3557 | 46.3557 | 46.3557 | 39.5522 | 41.4650 | 39.3228 |
| 0.0168 | 18.0 | 396 | 2.6308 | 47.5219 | 47.5219 | 47.5219 | 42.2655 | 44.2762 | 41.7151 |
| 0.046 | 19.0 | 418 | 2.6696 | 47.2303 | 47.2303 | 47.2303 | 40.3353 | 41.2212 | 40.6420 |
| 0.0226 | 20.0 | 440 | 2.6834 | 49.7085 | 49.7085 | 49.7085 | 43.1286 | 44.1373 | 42.6577 |
| 0.0104 | 21.0 | 462 | 2.9119 | 48.9796 | 48.9796 | 48.9796 | 42.5756 | 45.0299 | 42.4775 |
| 0.0116 | 22.0 | 484 | 3.1352 | 47.8134 | 47.8134 | 47.8134 | 41.2449 | 45.3324 | 40.3429 |
| 0.0135 | 23.0 | 506 | 2.8475 | 51.0204 | 51.0204 | 51.0204 | 44.5566 | 48.2813 | 43.4719 |
| 0.006 | 24.0 | 528 | 3.0071 | 50.0 | 50.0 | 50.0 | 43.0101 | 44.3103 | 42.8316 |
| 0.0014 | 25.0 | 550 | 3.0842 | 51.1662 | 51.1662 | 51.1662 | 43.7557 | 45.2358 | 43.1885 |
| 0.0004 | 26.0 | 572 | 3.1024 | 48.2507 | 48.2507 | 48.2507 | 41.5677 | 43.0218 | 40.8334 |
| 0.0002 | 27.0 | 594 | 3.1003 | 49.7085 | 49.7085 | 49.7085 | 43.6808 | 44.7579 | 43.1511 |
| 0.0067 | 28.0 | 616 | 3.1205 | 49.2711 | 49.2711 | 49.2711 | 42.6753 | 44.2513 | 41.9035 |
| 0.0051 | 29.0 | 638 | 3.1366 | 49.2711 | 49.2711 | 49.2711 | 42.5911 | 44.0526 | 41.8990 |
| 0.0001 | 30.0 | 660 | 3.1395 | 49.8542 | 49.8542 | 49.8542 | 44.0969 | 45.2099 | 43.5805 |
| 0.0001 | 31.0 | 682 | 3.1607 | 49.4169 | 49.4169 | 49.4169 | 43.5479 | 44.8908 | 42.9051 |
| 0.0001 | 32.0 | 704 | 3.1695 | 48.9796 | 48.9796 | 48.9796 | 42.7450 | 44.1869 | 42.1162 |
| 0.003 | 33.0 | 726 | 3.1716 | 49.5627 | 49.5627 | 49.5627 | 43.5619 | 44.8377 | 43.0366 |
| 0.0032 | 34.0 | 748 | 3.1751 | 49.5627 | 49.5627 | 49.5627 | 43.6102 | 44.8916 | 43.1014 |
| 0.0001 | 35.0 | 770 | 3.1795 | 49.7085 | 49.7085 | 49.7085 | 43.4435 | 44.8682 | 42.8522 |
| 0.0001 | 36.0 | 792 | 3.1832 | 49.5627 | 49.5627 | 49.5627 | 43.1845 | 44.6756 | 42.5562 |
| 0.0001 | 37.0 | 814 | 3.1832 | 49.7085 | 49.7085 | 49.7085 | 43.7243 | 44.9679 | 43.2027 |
| 0.0035 | 38.0 | 836 | 3.1926 | 49.7085 | 49.7085 | 49.7085 | 43.8761 | 45.0737 | 43.3237 |
| 0.0001 | 39.0 | 858 | 3.1918 | 49.5627 | 49.5627 | 49.5627 | 43.6425 | 44.9685 | 43.0969 |
| 0.0001 | 40.0 | 880 | 3.1918 | 49.8542 | 49.8542 | 49.8542 | 44.0461 | 45.3099 | 43.5243 |
| 0.0001 | 41.0 | 902 | 3.1860 | 49.4169 | 49.4169 | 49.4169 | 43.5586 | 44.8246 | 43.0302 |
| 0.0001 | 42.0 | 924 | 3.1926 | 49.2711 | 49.2711 | 49.2711 | 43.4018 | 44.6383 | 42.8685 |
| 0.0001 | 43.0 | 946 | 3.2005 | 49.4169 | 49.4169 | 49.4169 | 43.3200 | 44.5562 | 42.7841 |
| 0.0019 | 44.0 | 968 | 3.1939 | 49.4169 | 49.4169 | 49.4169 | 43.3207 | 44.6480 | 42.7451 |
| 0.002 | 45.0 | 990 | 3.1903 | 49.5627 | 49.5627 | 49.5627 | 43.7072 | 44.8315 | 43.2268 |
| 0.002 | 46.0 | 1012 | 3.1801 | 50.1458 | 50.1458 | 50.1458 | 44.2446 | 45.4932 | 43.7014 |
| 0.0001 | 47.0 | 1034 | 3.1941 | 49.5627 | 49.5627 | 49.5627 | 43.5439 | 44.7510 | 43.0411 |
| 0.0001 | 47.7356 | 1050 | 3.1976 | 49.5627 | 49.5627 | 49.5627 | 43.4123 | 44.6492 | 42.9112 |
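
The evaluation summary at the top of this card matches the epoch-25 row, which holds the best micro F1 in the table, rather than the final row. Recovering that best row can be sketched as (a handful of rows hard-coded from the table above):

```python
# (epoch, validation_loss, micro_f1) rows copied from the table above.
rows = [
    (23.0, 2.8475, 51.0204),
    (24.0, 3.0071, 50.0),
    (25.0, 3.0842, 51.1662),
    (26.0, 3.1024, 48.2507),
    (46.0, 3.1801, 50.1458),
]

# Pick the checkpoint with the highest micro F1 rather than the last one.
best = max(rows, key=lambda r: r[2])
print(best)  # (25.0, 3.0842, 51.1662)
```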

Framework versions

  • Transformers 4.50.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.15.0
  • Tokenizers 0.21.1

Model details

  • Model size: 150M params
  • Tensor type: BF16
  • Weights format: Safetensors

Model tree for clapAI/ModernBERT-base-VSMEC-ep50

This model is fine-tuned from answerdotai/ModernBERT-base.