modernbert-large-docx

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unknown dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the results):

  • Loss: 0.5145
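
Since the card does not state the downstream task or dataset, the sketch below only loads the bare encoder and inspects its embeddings. The repo id is taken from this card, but the choice of AutoModel (rather than a task-specific head such as AutoModelForSequenceClassification) is an assumption. Note that ModernBERT support requires Transformers 4.48 or newer (see Framework versions below).

```python
# Minimal loading sketch. The task this checkpoint was fine-tuned for is not
# stated on the card, so we load the bare encoder; swap AutoModel for the
# task-specific class (e.g. AutoModelForSequenceClassification) once known.
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "m8than/modernbert-large-docx"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("An example input sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Last hidden state: (batch, sequence_length, hidden_size=1024 for -large)
print(outputs.last_hidden_state.shape)
```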

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 148
  • num_epochs: 5
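
As a reproducibility aid, the values above map onto transformers.TrainingArguments roughly as sketched below. The output_dir and the evaluation cadence are assumptions (the card does not state them, though the results table logs every 100 steps), and anything not listed on the card is left at its default.

```python
# Sketch: the card's hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-large-docx",  # assumption: not stated on the card
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",                 # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=148,
    num_train_epochs=5,
    eval_strategy="steps",               # assumption: matches the table's cadence
    eval_steps=100,
)
```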

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5541        | 0.1686 | 100  | 0.5688          |
| 0.5465        | 0.3373 | 200  | 0.5476          |
| 0.5054        | 0.5059 | 300  | 0.5369          |
| 0.5113        | 0.6745 | 400  | 0.5335          |
| 0.5281        | 0.8432 | 500  | 0.5354          |
| 0.5441        | 1.0118 | 600  | 0.5312          |
| 0.4983        | 1.1804 | 700  | 0.5269          |
| 0.5151        | 1.3491 | 800  | 0.5257          |
| 0.5247        | 1.5177 | 900  | 0.5258          |
| 0.5212        | 1.6863 | 1000 | 0.5343          |
| 0.5243        | 1.8550 | 1100 | 0.5190          |
| 0.5007        | 2.0236 | 1200 | 0.5206          |
| 0.4971        | 2.1922 | 1300 | 0.5260          |
| 0.504         | 2.3609 | 1400 | 0.5264          |
| 0.5152        | 2.5295 | 1500 | 0.5229          |
| 0.5269        | 2.6981 | 1600 | 0.5264          |
| 0.5202        | 2.8668 | 1700 | 0.5282          |
| 0.5117        | 3.0354 | 1800 | 0.5179          |
| 0.5163        | 3.2040 | 1900 | 0.5168          |
| 0.4929        | 3.3727 | 2000 | 0.5165          |
| 0.5017        | 3.5413 | 2100 | 0.5151          |
| 0.5031        | 3.7099 | 2200 | 0.5155          |
| 0.52          | 3.8786 | 2300 | 0.5155          |
| 0.5055        | 4.0472 | 2400 | 0.5143          |
| 0.4968        | 4.2159 | 2500 | 0.5138          |
| 0.4868        | 4.3845 | 2600 | 0.5147          |
| 0.4888        | 4.5531 | 2700 | 0.5145          |
| 0.4994        | 4.7218 | 2800 | 0.5145          |
| 0.4911        | 4.8904 | 2900 | 0.5145          |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.1
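
Transformers 4.48.0.dev0 is a pre-release build, so as a convenience (not an official requirement) a quick runtime check against the versions above might look like this:

```python
# Sketch: warn if the local environment diverges from the versions listed
# above. Exact pins are unlikely to be required; treat mismatches as hints.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.48",   # card lists 4.48.0.dev0 (pre-release)
    "torch": "2.4.1",         # card lists 2.4.1+cu121 (CUDA 12.1 build)
    "datasets": "3.1.0",
    "tokenizers": "0.21.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    if not installed[name].startswith(want):
        print(f"note: {name} is {installed[name]}, card lists {want}")
```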
Model size

  • 396M parameters (F32, Safetensors)