---
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: long_T5_sum_challenge
    results: []
---

long_T5_sum_challenge

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch for reproducing these metrics follows the list):

  • Loss: 7.6212
  • Rouge1: 0.1351
  • Rouge2: 0.0198
  • Rougel: 0.1098
  • Rougelsum: 0.1097
  • Gen Len: 20.0
  • Bleu: 0.0
  • Precisions (BLEU): 0.0382
  • Brevity Penalty (BLEU): 0.516
  • Length Ratio (BLEU): 0.6018
  • Translation Length (BLEU): 727.0
  • Reference Length (BLEU): 1208.0
  • Precision (BERTScore): 0.8374
  • Recall (BERTScore): 0.8398
  • F1 (BERTScore): 0.8385
  • Hashcode (BERTScore): roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
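The ROUGE, BLEU, and BERTScore fields above appear to correspond to the standard Hugging Face evaluate metrics; the hashcode is the usual bert_score signature for roberta-large without IDF weighting. A minimal sketch of how comparable numbers could be reproduced, assuming the evaluate library and placeholder predictions/references (the card does not name the evaluation set):

```python
import evaluate

# Placeholder data; substitute the real model outputs and reference summaries.
predictions = ["a generated summary for one document"]
references = ["the reference summary for the same document"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
# The BLEU result carries "precisions", "brevity_penalty", "length_ratio",
# "translation_length" and "reference_length", matching the fields above.
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large by default, consistent with the hashcode.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```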

Model description

More information needed

Intended uses & limitations

More information needed
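Until the author adds details, a minimal inference sketch for summarization is given below. The Hub repo id is an assumption derived from the model name, and the generation settings are illustrative only (the constant Gen Len of 20.0 above suggests evaluation capped generation near 20 tokens):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id based on the model name; substitute the actual checkpoint path.
model_id = "floflodebilbao/long_T5_sum_challenge"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "Replace this with the long document to summarize."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
# max_new_tokens=20 mirrors the reported Gen Len (an assumption, adjust as needed).
summary_ids = model.generate(**inputs, max_new_tokens=20, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```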

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the training-arguments sketch after the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 12
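
For reference, here is a hedged sketch of Seq2SeqTrainingArguments that mirrors the settings above; output_dir, predict_with_generate, and anything else not listed in the card are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_T5_sum_challenge",  # assumed output directory
    learning_rate=2e-05,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,      # yields the total train batch size of 16
    num_train_epochs=12,
    lr_scheduler_type="linear",
    optim="adamw_torch",                 # AdamW, betas=(0.9, 0.999), epsilon=1e-08
    seed=42,
    predict_with_generate=True,          # assumed, since ROUGE/BLEU are computed at eval time
)
```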

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|
| No log | 1.0 | 7 | 28.1900 | 0.1279 | 0.0207 | 0.1007 | 0.1007 | 20.0 | 0.0 | 0.0385 | 0.5476 | 0.6242 | 754.0 | 1208.0 | 0.8406 | 0.8397 | 0.8401 |
| No log | 2.0 | 14 | 25.4058 | 0.1405 | 0.0253 | 0.1073 | 0.1077 | 20.0 | 0.0 | 0.0427 | 0.5395 | 0.6184 | 747.0 | 1208.0 | 0.8426 | 0.8415 | 0.842 |
| No log | 3.0 | 21 | 23.2077 | 0.145 | 0.0273 | 0.1108 | 0.1107 | 20.0 | 0.0078 | 0.045 | 0.5453 | 0.6225 | 752.0 | 1208.0 | 0.8434 | 0.8418 | 0.8425 |
| No log | 4.0 | 28 | 21.4668 | 0.1471 | 0.0265 | 0.115 | 0.114 | 20.0 | 0.007 | 0.0437 | 0.543 | 0.6209 | 750.0 | 1208.0 | 0.8438 | 0.8422 | 0.8429 |
| No log | 5.0 | 35 | 19.9034 | 0.1508 | 0.0275 | 0.1178 | 0.117 | 20.0 | 0.0072 | 0.0442 | 0.5453 | 0.6225 | 752.0 | 1208.0 | 0.8444 | 0.8425 | 0.8434 |
| No log | 6.0 | 42 | 18.3550 | 0.1519 | 0.0288 | 0.1197 | 0.1189 | 20.0 | 0.0071 | 0.0437 | 0.536 | 0.6159 | 744.0 | 1208.0 | 0.8446 | 0.8428 | 0.8436 |
| No log | 7.0 | 49 | 16.6792 | 0.1515 | 0.0289 | 0.1195 | 0.1194 | 20.0 | 0.0072 | 0.0442 | 0.5407 | 0.6192 | 748.0 | 1208.0 | 0.8453 | 0.8427 | 0.8439 |
| No log | 8.0 | 56 | 14.7162 | 0.1445 | 0.0228 | 0.116 | 0.1157 | 20.0 | 0.0 | 0.0391 | 0.5348 | 0.6151 | 743.0 | 1208.0 | 0.8427 | 0.8409 | 0.8417 |
| No log | 9.0 | 63 | 12.4721 | 0.1556 | 0.0205 | 0.1205 | 0.1208 | 20.0 | 0.0 | 0.0391 | 0.5325 | 0.6134 | 741.0 | 1208.0 | 0.8449 | 0.8421 | 0.8435 |
| No log | 10.0 | 70 | 10.1636 | 0.1562 | 0.0263 | 0.1245 | 0.1238 | 20.0 | 0.0 | 0.0418 | 0.5407 | 0.6192 | 748.0 | 1208.0 | 0.845 | 0.8433 | 0.8441 |
| No log | 11.0 | 77 | 8.3358 | 0.142 | 0.0231 | 0.1155 | 0.1153 | 20.0 | 0.0 | 0.0401 | 0.5125 | 0.5993 | 724.0 | 1208.0 | 0.8395 | 0.8411 | 0.8402 |
| No log | 12.0 | 84 | 7.6212 | 0.1351 | 0.0198 | 0.1098 | 0.1097 | 20.0 | 0.0 | 0.0382 | 0.516 | 0.6018 | 727.0 | 1208.0 | 0.8374 | 0.8398 | 0.8385 |

The BERTScore hashcode was identical for every evaluation: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1