modernbert-ct4a-11-no-aug

This model is a fine-tuned version of answerdotai/ModernBERT-large; the training dataset is not specified. After the final epoch (epoch 5, step 770), it achieves the following results on the evaluation set:

  • Loss: 0.4779
  • Accuracy: 0.9075
  • F1: 0.7601
  • AUC: 0.8332
  • Accuracy Per Label: [0.9051094890510949, 0.9197080291970803, 0.8978102189781022]
  • F1 Per Label: [0.7547169811320755, 0.7441860465116279, 0.78125]
  • AUC Per Label: [0.853083853083853, 0.8031878031878031, 0.8433752141633354]
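
The aggregate Accuracy, F1, and AUC values above agree, to four decimals, with the unweighted mean of the three per-label values. The card does not state how the aggregates were computed, so the macro-averaging reading below is an inference from the numbers, and the variable names are illustrative only. A minimal sketch of the check:

```python
from statistics import fmean

# Per-label values copied from the evaluation results above.
per_label = {
    "accuracy": [0.9051094890510949, 0.9197080291970803, 0.8978102189781022],
    "f1": [0.7547169811320755, 0.7441860465116279, 0.78125],
    "auc": [0.853083853083853, 0.8031878031878031, 0.8433752141633354],
}

# Aggregate values as reported on the card (four decimals).
reported = {"accuracy": 0.9075, "f1": 0.7601, "auc": 0.8332}

# Assumption: each aggregate is the unweighted (macro) mean over the labels.
macro = {name: fmean(values) for name, values in per_label.items()}

for name, value in macro.items():
    print(f"{name}: macro mean {value:.4f}, reported {reported[name]:.4f}")
```

If the aggregates had been micro-averaged or sample-weighted, they would generally not coincide with these means.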

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
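
With lr_scheduler_type set to linear, the learning rate decays linearly from its initial value toward zero over the course of training. The sketch below illustrates that shape under two assumptions that the hyperparameter list does not confirm: no warmup steps (none are listed), and 770 total optimizer steps (taken from the last row of the training results). The function name `linear_lr` is hypothetical, not a library API.

```python
def linear_lr(step: int, base_lr: float = 2e-5, total_steps: int = 770,
              warmup_steps: int = 0) -> float:
    """Learning rate at optimizer step `step` under linear warmup, then linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed 154 optimizer steps per epoch (770 steps / 5 epochs).
for epoch, step in enumerate(range(154, 771, 154), start=1):
    print(f"end of epoch {epoch} (step {step}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate has already fallen to 40% of its initial value by the end of epoch 3.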

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | AUC | Accuracy Per Label | F1 Per Label | AUC Per Label |
|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 154 | 0.2880 | 0.8686 | 0.5709 | 0.7054 | [0.8759124087591241, 0.8686131386861314, 0.8613138686131386] | [0.5853658536585366, 0.5, 0.6274509803921569] | [0.7172557172557175, 0.6685724185724186, 0.73043974871502] |
| No log | 2.0 | 308 | 0.2495 | 0.8954 | 0.6899 | 0.7750 | [0.8686131386861314, 0.9197080291970803, 0.8978102189781022] | [0.5714285714285714, 0.7317073170731707, 0.7666666666666667] | [0.7127512127512129, 0.7884615384615384, 0.8236721873215306] |
| No log | 3.0 | 462 | 0.3805 | 0.9100 | 0.7742 | 0.8496 | [0.9051094890510949, 0.9197080291970803, 0.9051094890510949] | [0.7450980392156863, 0.7659574468085106, 0.8115942028985508] | [0.8383575883575884, 0.8326403326403327, 0.8777841233580811] |
| 0.1833 | 4.0 | 616 | 0.4659 | 0.9051 | 0.7504 | 0.8234 | [0.9051094890510949, 0.9197080291970803, 0.8905109489051095] | [0.7450980392156863, 0.7441860465116279, 0.7619047619047619] | [0.8383575883575884, 0.8031878031878031, 0.8286693318103941] |
| 0.1833 | 5.0 | 770 | 0.4779 | 0.9075 | 0.7601 | 0.8332 | [0.9051094890510949, 0.9197080291970803, 0.8978102189781022] | [0.7547169811320755, 0.7441860465116279, 0.78125] | [0.853083853083853, 0.8031878031878031, 0.8433752141633354] |
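
In the results above, validation loss is lowest at epoch 2, while Accuracy, F1, and AUC peak at epoch 3; the headline metrics on this card are those of the final epoch. The sketch below picks the best epoch under each criterion from the logged values. The card does not say which checkpoint the published weights correspond to, so this only shows how the choice of criterion changes the answer; the `EpochLog` type is illustrative.

```python
from typing import NamedTuple


class EpochLog(NamedTuple):
    epoch: int
    step: int
    val_loss: float
    accuracy: float
    f1: float
    auc: float


# Aggregate values copied from the training results above.
history = [
    EpochLog(1, 154, 0.2880, 0.8686, 0.5709, 0.7054),
    EpochLog(2, 308, 0.2495, 0.8954, 0.6899, 0.7750),
    EpochLog(3, 462, 0.3805, 0.9100, 0.7742, 0.8496),
    EpochLog(4, 616, 0.4659, 0.9051, 0.7504, 0.8234),
    EpochLog(5, 770, 0.4779, 0.9075, 0.7601, 0.8332),
]

best_by_loss = min(history, key=lambda row: row.val_loss)
best_by_f1 = max(history, key=lambda row: row.f1)
best_by_auc = max(history, key=lambda row: row.auc)

print(f"lowest validation loss: epoch {best_by_loss.epoch} ({best_by_loss.val_loss})")
print(f"highest F1: epoch {best_by_f1.epoch} ({best_by_f1.f1})")
print(f"highest AUC: epoch {best_by_auc.epoch} ({best_by_auc.auc})")
```

Validation loss rising after epoch 2 while the thresholded metrics keep improving through epoch 3 is a common pattern when a model grows more confident on the examples it gets wrong, so the two criteria need not agree.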

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1

Model size: 396M params (Safetensors, tensor type F32)