modernbert-ct4a-11

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4627
  • Accuracy: 0.9246
  • F1: 0.7972
  • AUC: 0.8491
  • Accuracy per label: [0.9197080291970803, 0.927007299270073, 0.927007299270073]
  • F1 per label: [0.7755102040816326, 0.782608695652174, 0.8333333333333334]
  • AUC per label: [0.8473665973665975, 0.837144837144837, 0.8627926898914906]
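The headline Accuracy, F1, and AUC appear to be unweighted (macro) averages of the three per-label values. This is an inference from the numbers, not something the card states; a quick check:

```python
# Per-label metrics copied from the evaluation results above.
acc_per_label = [0.9197080291970803, 0.927007299270073, 0.927007299270073]
f1_per_label = [0.7755102040816326, 0.782608695652174, 0.8333333333333334]
auc_per_label = [0.8473665973665975, 0.837144837144837, 0.8627926898914906]

def macro(xs):
    """Unweighted (macro) average across labels."""
    return sum(xs) / len(xs)

print(round(macro(acc_per_label), 4))  # 0.9246, matching the reported Accuracy
print(round(macro(f1_per_label), 4))   # 0.7972, matching the reported F1
print(round(macro(auc_per_label), 4))  # 0.8491, matching the reported AUC
```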

Model description

More information needed

Intended uses & limitations

More information needed
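The card provides no usage example. Below is a minimal inference sketch, assuming the checkpoint is published on the Hub as pawan2411/modernbert-ct4a-11 and that the three per-label metrics above reflect a multi-label classification head scored with a per-label sigmoid; the 0.5 decision threshold and the placeholder input text are likewise assumptions:

```python
# Minimal inference sketch (assumptions: checkpoint id, multi-label head
# with sigmoid scoring, and a 0.5 decision threshold).
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "pawan2411/modernbert-ct4a-11"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

inputs = tokenizer("Example input text to classify.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # shape: (1, num_labels)
probs = torch.sigmoid(logits)            # per-label probabilities
preds = (probs > 0.5).long()             # 0/1 decision per label
print(probs.tolist(), preds.tolist())
```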

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
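A quick consistency check of these hyperparameters against the training-results table below (my arithmetic, not stated in the card): 804 optimizer steps per epoch over 5 epochs gives a final step count of 4020, matching the last table row, and with a train batch size of 8 it implies a training set of at most 804 × 8 = 6432 examples (the last batch of each epoch may be partial).

```python
# Consistency check between the hyperparameters and the training-results table.
train_batch_size = 8
steps_per_epoch = 804  # "Step" column at epoch 1.0 in the table
num_epochs = 5

total_steps = steps_per_epoch * num_epochs
print(total_steps)  # 4020, the "Step" value in the final table row

# steps_per_epoch = ceil(n_train / train_batch_size), so the training set
# holds at most steps_per_epoch * train_batch_size examples.
max_train_examples = steps_per_epoch * train_batch_size
print(max_train_examples)  # 6432
```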

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | AUC | Accuracy per label | F1 per label | AUC per label |
|---|---|---|---|---|---|---|---|---|---|
| 0.322 | 1.0 | 804 | 0.4753 | 0.8954 | 0.6722 | 0.7638 | [0.8686, 0.9051, 0.9124] | [0.5500, 0.6667, 0.8000] | [0.6980, 0.7500, 0.8432] |
| 0.1102 | 2.0 | 1608 | 0.4381 | 0.9148 | 0.7919 | 0.8660 | [0.8905, 0.9270, 0.9270] | [0.7368, 0.8000, 0.8387] | [0.8588, 0.8666, 0.8726] |
| 0.0806 | 3.0 | 2412 | 0.5094 | 0.9148 | 0.7766 | 0.8411 | [0.9197, 0.9197, 0.9051] | [0.7843, 0.7660, 0.7797] | [0.8621, 0.8326, 0.8285] |
| 0.0642 | 4.0 | 3216 | 0.4952 | 0.9197 | 0.7894 | 0.8491 | [0.9197, 0.9270, 0.9124] | [0.7925, 0.7826, 0.7931] | [0.8768, 0.8371, 0.8334] |
| 0.0587 | 5.0 | 4020 | 0.4627 | 0.9246 | 0.7972 | 0.8491 | [0.9197, 0.9270, 0.9270] | [0.7755, 0.7826, 0.8333] | [0.8474, 0.8371, 0.8628] |

Per-label values are rounded to four decimal places; full-precision final-epoch values are given in the evaluation summary at the top of the card.

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1
Model size

  • 396M parameters (F32, Safetensors)
