---
library_name: peft
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: Lora_long_T5_sum_challenge
    results: []
---

# Lora_long_T5_sum_challenge

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 1.2142
- Rouge1: 0.2852
- Rouge2: 0.0966
- Rougel: 0.2231
- Rougelsum: 0.2243
- Gen Len: 28.38
- Bleu: 0.0405
- Precisions: 0.0919
- Brevity Penalty: 0.8771
- Length Ratio: 0.8841
- Translation Length: 1068.0
- Reference Length: 1208.0
- Precision: 0.8739 (BERTScore)
- Recall: 0.8718 (BERTScore)
- F1: 0.8728 (BERTScore)
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
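The Precision, Recall, and F1 rows are BERTScore metrics; the hashcode records the scoring model (roberta-large, layer 17, no IDF) and the library versions used. As a minimal sketch, scores of this kind can be computed with the `bert-score` package; the candidate and reference lists below are hypothetical:

```python
from bert_score import score

# Hypothetical generated summaries and gold references
candidates = ["The model summarizes long documents."]
references = ["This model produces summaries of long documents."]

# model_type="roberta-large" matches the hashcode reported above;
# returns per-example precision, recall, and F1 tensors
P, R, F1 = score(candidates, references, model_type="roberta-large", verbose=False)
print(P.mean().item(), R.mean().item(), F1.mean().item())
```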

## Model description

More information needed

## Intended uses & limitations

More information needed
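Pending fuller documentation, the sketch below shows roughly how this PEFT adapter would be loaded for inference. It assumes the adapter is hosted as `floflodebilbao/Lora_long_T5_sum_challenge` (inferred from this card, not confirmed), and the input document is a placeholder:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_id = "google/long-t5-tglobal-base"
adapter_id = "floflodebilbao/Lora_long_T5_sum_challenge"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)  # attach the LoRA adapter

document = "..."  # placeholder: a long input document to summarize
inputs = tokenizer(document, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```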

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
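The LoRA configuration itself (rank, alpha, target modules) is not recorded in this card. Below is a minimal sketch of a training setup consistent with the hyperparameters above, with the unrecorded LoRA values left as placeholders:

```python
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSeq2SeqLM.from_pretrained("google/long-t5-tglobal-base")

# LoRA settings are NOT recorded in this card; r/alpha/dropout are placeholders
lora_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=8, lora_alpha=16, lora_dropout=0.05)
model = get_peft_model(base, lora_config)

# Values below mirror the hyperparameter list above
args = Seq2SeqTrainingArguments(
    output_dir="Lora_long_T5_sum_challenge",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective train batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    seed=42,
    optim="adamw_torch",
    predict_with_generate=True,
)
```

Note that with `per_device_train_batch_size=1` and `gradient_accumulation_steps=16`, the effective total train batch size is 16, matching the list above.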

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:--------:|
| 22.2581 | 1.0 | 7 | 6.5353 | 0.084 | 0.0147 | 0.0714 | 0.0714 | 31.0 | 0.0047 | 0.0247 | 0.5558 | 0.63 | 761.0 | 1208.0 | 0.7817 | 0.8234 | 0.8014 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 6.7792 | 2.0 | 14 | 5.1759 | 0.1642 | 0.0129 | 0.13 | 0.1296 | 30.46 | 0.0 | 0.044 | 0.755 | 0.7806 | 943.0 | 1208.0 | 0.8343 | 0.8356 | 0.8349 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 4.5124 | 3.0 | 21 | 3.7445 | 0.2094 | 0.0517 | 0.1669 | 0.1666 | 28.9 | 0.021 | 0.0606 | 0.8336 | 0.846 | 1022.0 | 1208.0 | 0.8516 | 0.8529 | 0.8521 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.5042 | 4.0 | 28 | 3.1497 | 0.2314 | 0.0579 | 0.1774 | 0.1772 | 29.1 | 0.0317 | 0.0716 | 0.8537 | 0.8634 | 1043.0 | 1208.0 | 0.855 | 0.8584 | 0.8567 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.8574 | 5.0 | 35 | 2.0950 | 0.2342 | 0.0664 | 0.1895 | 0.1897 | 28.34 | 0.0325 | 0.0756 | 0.8584 | 0.8675 | 1048.0 | 1208.0 | 0.8581 | 0.8605 | 0.8593 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.0046 | 6.0 | 42 | 1.4599 | 0.2643 | 0.0843 | 0.2074 | 0.2081 | 28.18 | 0.036 | 0.0853 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8665 | 0.8652 | 0.8658 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.4948 | 7.0 | 49 | 1.2786 | 0.2831 | 0.0921 | 0.2203 | 0.2208 | 28.3 | 0.0413 | 0.0893 | 0.8855 | 0.8916 | 1077.0 | 1208.0 | 0.8703 | 0.8681 | 0.8691 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.2731 | 8.0 | 56 | 1.2338 | 0.2802 | 0.096 | 0.2204 | 0.2221 | 28.26 | 0.0406 | 0.0893 | 0.8753 | 0.8825 | 1066.0 | 1208.0 | 0.8729 | 0.8705 | 0.8717 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.1977 | 9.0 | 63 | 1.2179 | 0.2834 | 0.0991 | 0.2233 | 0.2244 | 28.42 | 0.0409 | 0.0919 | 0.8725 | 0.88 | 1063.0 | 1208.0 | 0.8745 | 0.8722 | 0.8733 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.1717 | 10.0 | 70 | 1.2142 | 0.2852 | 0.0966 | 0.2231 | 0.2243 | 28.38 | 0.0405 | 0.0919 | 0.8771 | 0.8841 | 1068.0 | 1208.0 | 0.8739 | 0.8718 | 0.8728 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |

### Framework versions

- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1
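To approximate this environment, something like `pip install peft==0.15.2 transformers==4.53.1 datasets==3.6.0 tokenizers==0.21.1 torch==2.7.0` should suffice; note that the `+cu126` suffix above indicates a PyTorch build compiled against CUDA 12.6, which may require installing from PyTorch's CUDA wheel index rather than PyPI.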