---
library_name: transformers
license: apache-2.0
base_model: allenai/led-base-16384
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: LED_sum_challenge2
    results: []
---

# LED_sum_challenge2

This model is a fine-tuned version of [allenai/led-base-16384](https://huggingface.co/allenai/led-base-16384) on an unknown dataset. It achieves the following results on the evaluation set (a brief usage sketch follows the metrics):

- Loss: 2.9586
- Rouge1: 0.2918
- Rouge2: 0.1012
- Rougel: 0.2293
- Rougelsum: 0.2288
- Gen Len: 28.12
- Bleu: 0.0548
- Precisions: 0.1048
- Brevity Penalty: 0.9001
- Length Ratio: 0.9048
- Translation Length: 1093.0
- Reference Length: 1208.0
- Precision (BERTScore): 0.8818
- Recall (BERTScore): 0.8759
- F1 (BERTScore): 0.8788
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
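
The checkpoint can be exercised with the standard LED summarization flow in Transformers. The sketch below is illustrative only: the Hub id `floflodebilbao/LED_sum_challenge2` is inferred from this repository's name, and the input text and generation settings are placeholders rather than values from the training run.

```python
import torch
from transformers import AutoTokenizer, LEDForConditionalGeneration

# Assumed Hub repo id (inferred from this model card, not confirmed by it).
model_id = "floflodebilbao/LED_sum_challenge2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = LEDForConditionalGeneration.from_pretrained(model_id)

text = "Replace this with the long document to summarize."
inputs = tokenizer(text, max_length=16384, truncation=True, return_tensors="pt")

# LED combines local attention with a sparse global pattern; putting global
# attention on the first token is the usual choice for summarization.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_new_tokens=64,  # placeholder; the eval Gen Len above averaged ~28 tokens
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```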

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
- mixed_precision_training: Native AMP
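
As a rough guide, these hyperparameters map onto `Seq2SeqTrainingArguments` as sketched below. `output_dir` and the `fp16` flag are assumptions (the card only states "Native AMP"), and the model, dataset, and `Seq2SeqTrainer` wiring are omitted.

```python
from transformers import Seq2SeqTrainingArguments

# Hypothetical reconstruction of the configuration listed above.
training_args = Seq2SeqTrainingArguments(
    output_dir="LED_sum_challenge2",  # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",              # AdamW (PyTorch implementation)
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=10,
    fp16=True,                        # "Native AMP" mixed-precision training
    predict_with_generate=True,       # needed for generation metrics at eval time
)
```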

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:--------:|
| 9.0848 | 1.0 | 13 | 7.5283 | 0.24 | 0.0579 | 0.1713 | 0.1714 | 31.78 | 0.0296 | 0.0629 | 1.0 | 1.0439 | 1261.0 | 1208.0 | 0.8521 | 0.8597 | 0.8558 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 6.171 | 2.0 | 26 | 4.9217 | 0.2695 | 0.0854 | 0.203 | 0.2033 | 25.98 | 0.0368 | 0.0987 | 0.8063 | 0.8228 | 994.0 | 1208.0 | 0.8806 | 0.8705 | 0.8755 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 4.4536 | 3.0 | 39 | 4.1312 | 0.2717 | 0.0862 | 0.2162 | 0.2157 | 23.34 | 0.0352 | 0.1067 | 0.6694 | 0.7136 | 862.0 | 1208.0 | 0.8846 | 0.8732 | 0.8788 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.7683 | 4.0 | 52 | 3.7332 | 0.3043 | 0.0981 | 0.2301 | 0.2308 | 25.46 | 0.0499 | 0.1154 | 0.7784 | 0.7997 | 966.0 | 1208.0 | 0.8885 | 0.8787 | 0.8835 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.3278 | 5.0 | 65 | 3.4699 | 0.2978 | 0.1041 | 0.2351 | 0.2344 | 25.38 | 0.0497 | 0.1117 | 0.7854 | 0.8055 | 973.0 | 1208.0 | 0.8869 | 0.8763 | 0.8815 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.0332 | 6.0 | 78 | 3.2808 | 0.2946 | 0.1013 | 0.2335 | 0.2319 | 26.48 | 0.0503 | 0.1069 | 0.8181 | 0.8328 | 1006.0 | 1208.0 | 0.8857 | 0.8774 | 0.8815 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.8037 | 7.0 | 91 | 3.1443 | 0.295 | 0.0965 | 0.2275 | 0.2264 | 27.52 | 0.0428 | 0.0978 | 0.8612 | 0.87 | 1051.0 | 1208.0 | 0.8822 | 0.8777 | 0.8799 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.637 | 8.0 | 104 | 3.0523 | 0.2834 | 0.0997 | 0.2263 | 0.2257 | 27.22 | 0.0499 | 0.1034 | 0.8527 | 0.8626 | 1042.0 | 1208.0 | 0.8813 | 0.8752 | 0.8781 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.5158 | 9.0 | 117 | 2.9900 | 0.2821 | 0.0989 | 0.2271 | 0.2273 | 27.18 | 0.0508 | 0.1051 | 0.848 | 0.8584 | 1037.0 | 1208.0 | 0.8842 | 0.8773 | 0.8806 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.4321 | 10.0 | 130 | 2.9586 | 0.2918 | 0.1012 | 0.2293 | 0.2288 | 28.12 | 0.0548 | 0.1048 | 0.9001 | 0.9048 | 1093.0 | 1208.0 | 0.8818 | 0.8759 | 0.8788 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
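
The metric families in this table (ROUGE, BLEU, and the BERTScore precision/recall/F1 identified by the `roberta-large_L17_no-idf` hashcode) can be recomputed with the `evaluate` library. The sketch below is illustrative, not the card's actual evaluation script; `predictions` and `references` are placeholders.

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

predictions = ["a generated summary"]  # placeholder model outputs
references = ["a reference summary"]   # placeholder gold summaries

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# roberta-large with its default layer (17) and no idf weighting matches
# the hashcode reported above.
print(bertscore.compute(predictions=predictions, references=references,
                        model_type="roberta-large"))
```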

### Framework versions

- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1