# mbart-large-50-b4-e4-lr2e-05-126k-jupyter
This model is a fine-tuned version of [facebook/mbart-large-50](https://huggingface.co/facebook/mbart-large-50) on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the metrics below):
- Loss: 0.1228
- Bleu: 89.8947
- Gen Len Ratio: 1.0215
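
As a minimal loading sketch with the Transformers MBart-50 classes: the training language pair is not documented on this card, so the `en_XX`/`de_DE` codes below are placeholders, not a statement about what the model was trained on.

```python
from transformers import MBart50TokenizerFast, MBartForConditionalGeneration

model_id = "fresst/mbart-large-50-b4-e4-lr2e-05-126k-jupyter"
model = MBartForConditionalGeneration.from_pretrained(model_id)
tokenizer = MBart50TokenizerFast.from_pretrained(model_id)

# Placeholder language codes: the actual source/target languages are undocumented.
tokenizer.src_lang = "en_XX"
inputs = tokenizer("Hello, world!", return_tensors="pt")
output_ids = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["de_DE"],
    max_new_tokens=64,
)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```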
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 2e-05
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- optimizer: Adafactor (`OptimizerNames.ADAFACTOR`), no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 4
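
The settings above map roughly onto the following `Seq2SeqTrainingArguments`; this is a sketch, not the actual training script, and `output_dir` and `predict_with_generate` are assumptions.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="mbart-large-50-b4-e4-lr2e-05-126k-jupyter",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    optim="adafactor",           # OptimizerNames.ADAFACTOR, no extra optimizer args
    lr_scheduler_type="linear",
    num_train_epochs=4,
    predict_with_generate=True,  # assumption: required to compute BLEU during eval
)
```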
### Training results

The table below tracks validation loss, BLEU, and generation-length ratio over training; a hedged sketch of how such metrics can be computed follows the table.
| Training Loss | Epoch  | Step   | Validation Loss | Bleu    | Gen Len Ratio |
|:-------------:|:------:|:------:|:---------------:|:-------:|:-------------:|
| 0.2859        | 0.3173 | 10000  | 0.1897          | 83.883  | 0.9985        |
| 0.2495        | 0.6346 | 20000  | 0.1607          | 85.2621 | 0.9957        |
| 0.1575        | 0.9519 | 30000  | 0.1373          | 89.4101 | 1.0267        |
| 0.0923        | 1.2692 | 40000  | 0.1355          | 87.5438 | 1.0071        |
| 0.0831        | 1.5865 | 50000  | 0.1292          | 89.4781 | 1.0206        |
| 0.0817        | 1.9038 | 60000  | 0.1228          | 89.8947 | 1.0215        |
| 0.0489        | 2.2211 | 70000  | 0.1386          | 90.0854 | 1.0228        |
| 0.0353        | 2.5384 | 80000  | 0.1381          | 89.9357 | 1.0194        |
| 0.0346        | 2.8557 | 90000  | 0.1348          | 89.8965 | 1.0178        |
| 0.0231        | 3.1730 | 100000 | 0.1524          | 87.853  | 0.9987        |
| 0.0138        | 3.4903 | 110000 | 0.1521          | 89.1289 | 1.0104        |
| 0.0129        | 3.8076 | 120000 | 0.1505          | 89.0184 | 1.008         |
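
BLEU here is presumably computed with sacreBLEU, and "Gen Len Ratio" reads as generated length divided by reference length; the exact metric code behind this card is not documented, so the following is only a sketch under those assumptions.

```python
import evaluate

sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(predictions, references):
    # BLEU via sacreBLEU (expects a list of reference lists per prediction).
    bleu = sacrebleu.compute(predictions=predictions,
                             references=[[r] for r in references])
    # Assumption: "Gen Len Ratio" = total generated length / total reference length.
    gen_len = sum(len(p.split()) for p in predictions)
    ref_len = sum(len(r.split()) for r in references)
    return {"bleu": bleu["score"], "gen_len_ratio": gen_len / ref_len}

print(compute_metrics(["the cat sat on the mat"], ["the cat sat on the mat"]))
```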
### Framework versions
- Transformers 4.49.0
- Pytorch 2.6.0+cu124
- Datasets 3.4.1
- Tokenizers 0.21.1