modernbert-large-docx

This model is a fine-tuned version of answerdotai/ModernBERT-large on an unknown dataset. It achieves the following results on the evaluation set (a minimal loading sketch follows the results):

  • Loss: 0.5145
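
Since the card does not state the downstream task or dataset, the sketch below only loads the bare encoder and inspects its embeddings. The repo id is taken from this card, but the choice of AutoModel (rather than a task-specific head such as AutoModelForSequenceClassification) is an assumption. Note that ModernBERT support requires Transformers 4.48 or newer (see Framework versions below).

```python
# Minimal loading sketch. The task this checkpoint was fine-tuned for is not
# stated on the card, so we load the bare encoder; swap AutoModel for the
# task-specific class (e.g. AutoModelForSequenceClassification) once known.
import torch
from transformers import AutoModel, AutoTokenizer

repo_id = "m8than/modernbert-large-docx"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModel.from_pretrained(repo_id)
model.eval()

inputs = tokenizer("An example input sentence.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Last hidden state: (batch, sequence_length, hidden_size=1024 for -large)
print(outputs.last_hidden_state.shape)
```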

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reproducing them follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 1
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 148
  • num_epochs: 5
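
As a reproducibility aid, the values above map onto transformers.TrainingArguments roughly as sketched below. The output_dir and the evaluation cadence are assumptions (the card does not state them, though the results table logs every 100 steps), and anything not listed on the card is left at its default.

```python
# Sketch: the card's hyperparameters expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="modernbert-large-docx",  # assumption: not stated on the card
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=1,
    seed=42,
    optim="adamw_torch",                 # AdamW, PyTorch implementation
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=148,
    num_train_epochs=5,
    eval_strategy="steps",               # assumption: matches the table's cadence
    eval_steps=100,
)
```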

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.5541        | 0.1686 | 100  | 0.5688          |
| 0.5465        | 0.3373 | 200  | 0.5476          |
| 0.5054        | 0.5059 | 300  | 0.5369          |
| 0.5113        | 0.6745 | 400  | 0.5335          |
| 0.5281        | 0.8432 | 500  | 0.5354          |
| 0.5441        | 1.0118 | 600  | 0.5312          |
| 0.4983        | 1.1804 | 700  | 0.5269          |
| 0.5151        | 1.3491 | 800  | 0.5257          |
| 0.5247        | 1.5177 | 900  | 0.5258          |
| 0.5212        | 1.6863 | 1000 | 0.5343          |
| 0.5243        | 1.8550 | 1100 | 0.5190          |
| 0.5007        | 2.0236 | 1200 | 0.5206          |
| 0.4971        | 2.1922 | 1300 | 0.5260          |
| 0.504         | 2.3609 | 1400 | 0.5264          |
| 0.5152        | 2.5295 | 1500 | 0.5229          |
| 0.5269        | 2.6981 | 1600 | 0.5264          |
| 0.5202        | 2.8668 | 1700 | 0.5282          |
| 0.5117        | 3.0354 | 1800 | 0.5179          |
| 0.5163        | 3.2040 | 1900 | 0.5168          |
| 0.4929        | 3.3727 | 2000 | 0.5165          |
| 0.5017        | 3.5413 | 2100 | 0.5151          |
| 0.5031        | 3.7099 | 2200 | 0.5155          |
| 0.52          | 3.8786 | 2300 | 0.5155          |
| 0.5055        | 4.0472 | 2400 | 0.5143          |
| 0.4968        | 4.2159 | 2500 | 0.5138          |
| 0.4868        | 4.3845 | 2600 | 0.5147          |
| 0.4888        | 4.5531 | 2700 | 0.5145          |
| 0.4994        | 4.7218 | 2800 | 0.5145          |
| 0.4911        | 4.8904 | 2900 | 0.5145          |

Framework versions

  • Transformers 4.48.0.dev0
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.21.1
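
Transformers 4.48.0.dev0 is a pre-release build, so as a convenience (not an official requirement) a quick runtime check against the versions above might look like this:

```python
# Sketch: warn if the local environment diverges from the versions listed
# above. Exact pins are unlikely to be required; treat mismatches as hints.
import datasets, tokenizers, torch, transformers

expected = {
    "transformers": "4.48",   # card lists 4.48.0.dev0 (pre-release)
    "torch": "2.4.1",         # card lists 2.4.1+cu121 (CUDA 12.1 build)
    "datasets": "3.1.0",
    "tokenizers": "0.21.1",
}
installed = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    if not installed[name].startswith(want):
        print(f"note: {name} is {installed[name]}, card lists {want}")
```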
Model size

  • 396M parameters (F32, Safetensors)