long_T5_sum_outcome / README.md
metadata
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
  - generated_from_trainer
metrics:
  - rouge
  - bleu
  - precision
  - recall
  - f1
model-index:
  - name: long_T5_sum_outcome
    results: []

long_T5_sum_outcome

This model is a fine-tuned version of google/long-t5-tglobal-base; the training dataset is not specified. After the final epoch (epoch 10, step 70), it achieves the following results on the evaluation set:

  • Loss: 16.6670
  • Rouge1: 0.179
  • Rouge2: 0.0376
  • Rougel: 0.1311
  • Rougelsum: 0.1303
  • Gen Len: 31.0
  • Bleu: 0.0183
  • Precisions: 0.0456
  • Brevity Penalty: 0.952
  • Length Ratio: 0.9531
  • Translation Length: 1117.0
  • Reference Length: 1172.0
  • Precision: 0.839 (BERTScore)
  • Recall: 0.8508 (BERTScore)
  • F1: 0.8447 (BERTScore)
  • Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) (BERTScore configuration)
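The BLEU length statistics in the list above are related by simple arithmetic. The sketch below recomputes the length ratio and brevity penalty from the reported translation and reference lengths. It assumes the standard BLEU definition, where the penalty is exp(1 - reference/translation) when the output is shorter than the reference; the function name is illustrative and not taken from the training code.

```python
import math


def bleu_length_stats(translation_length: float, reference_length: float):
    """Return (length_ratio, brevity_penalty) under the standard BLEU definition."""
    length_ratio = translation_length / reference_length
    if translation_length >= reference_length:
        # No penalty when the output is at least as long as the reference.
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - reference_length / translation_length)
    return length_ratio, brevity_penalty


# Values reported for the final evaluation.
ratio, penalty = bleu_length_stats(1117.0, 1172.0)
print(f"length ratio: {ratio:.4f}, brevity penalty: {penalty:.4f}")
```

After rounding, both values agree with the reported Length Ratio (0.9531) and Brevity Penalty (0.952).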

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
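As a rough sketch, the hyperparameters above can be written as keyword arguments using the parameter names of `transformers.TrainingArguments`. This is a reconstruction under stated assumptions, not the author's actual script: it assumes a single device (so `train_batch_size` maps to `per_device_train_batch_size`), no warmup (the card lists none), and 70 total optimizer steps (the last step in the training results). The helper functions are hypothetical and only illustrate how the effective batch size and the linear learning-rate decay follow from these values.

```python
# Keyword names follow transformers.TrainingArguments; values come from the list above.
training_kwargs = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 1,
    "per_device_eval_batch_size": 1,
    "seed": 42,
    "gradient_accumulation_steps": 16,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 10,
}

NUM_DEVICES = 1   # assumption: single device, consistent with 1 x 16 = 16
WARMUP_STEPS = 0  # assumption: the card lists no warmup
TOTAL_STEPS = 70  # last optimizer step reported in the training results


def effective_batch_size(kwargs: dict, num_devices: int) -> int:
    """Examples contributing to one optimizer update."""
    return (
        kwargs["per_device_train_batch_size"]
        * kwargs["gradient_accumulation_steps"]
        * num_devices
    )


def linear_lr(step: int, base_lr: float, total_steps: int, warmup_steps: int = 0) -> float:
    """Learning rate at a given optimizer step under linear warmup then linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


print("total_train_batch_size:", effective_batch_size(training_kwargs, NUM_DEVICES))
for step in (0, 35, 70):
    lr = linear_lr(step, training_kwargs["learning_rate"], TOTAL_STEPS, WARMUP_STEPS)
    print(f"step {step}: lr = {lr:.2e}")
```

Under these assumptions the learning rate falls from 2e-05 at step 0 to zero at step 70, and the dictionary could be passed to `Seq2SeqTrainingArguments(output_dir=..., **training_kwargs)`, with `output_dir` chosen by the user.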

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 26.8047 | 1.0 | 7 | 28.1961 | 0.1991 | 0.0452 | 0.1451 | 0.1451 | 31.0 | 0.0226 | 0.0516 | 0.9697 | 0.9701 | 1137.0 | 1172.0 | 0.8417 | 0.854 | 0.8477 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 24.4577 | 2.0 | 14 | 25.5517 | 0.1938 | 0.0452 | 0.1398 | 0.14 | 31.0 | 0.0225 | 0.0511 | 0.9617 | 0.9625 | 1128.0 | 1172.0 | 0.8404 | 0.8526 | 0.8464 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 22.7574 | 3.0 | 21 | 23.4583 | 0.1872 | 0.0412 | 0.1373 | 0.1368 | 31.0 | 0.0218 | 0.0488 | 0.9582 | 0.959 | 1124.0 | 1172.0 | 0.8393 | 0.8517 | 0.8454 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 21.0685 | 4.0 | 28 | 21.7844 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 20.037 | 5.0 | 35 | 20.3738 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 18.9602 | 6.0 | 42 | 19.1832 | 0.1862 | 0.0397 | 0.1322 | 0.1313 | 31.0 | 0.0194 | 0.0484 | 0.9546 | 0.9556 | 1120.0 | 1172.0 | 0.8396 | 0.8512 | 0.8453 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 18.0035 | 7.0 | 49 | 18.1774 | 0.1853 | 0.0397 | 0.1321 | 0.1313 | 31.0 | 0.0194 | 0.0482 | 0.9555 | 0.9565 | 1121.0 | 1172.0 | 0.8394 | 0.8511 | 0.8451 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 17.2636 | 8.0 | 56 | 17.4050 | 0.1862 | 0.0396 | 0.1328 | 0.132 | 31.0 | 0.0195 | 0.0478 | 0.9493 | 0.9505 | 1114.0 | 1172.0 | 0.8389 | 0.8507 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 16.5914 | 9.0 | 63 | 16.8775 | 0.179 | 0.0376 | 0.1302 | 0.1293 | 31.0 | 0.019 | 0.046 | 0.9502 | 0.9514 | 1115.0 | 1172.0 | 0.8384 | 0.8504 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 16.2789 | 10.0 | 70 | 16.6670 | 0.179 | 0.0376 | 0.1311 | 0.1303 | 31.0 | 0.0183 | 0.0456 | 0.952 | 0.9531 | 1117.0 | 1172.0 | 0.839 | 0.8508 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
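In the results above, validation loss falls at every epoch while Rouge1 is highest after the first epoch, so the choice of checkpoint depends on which metric is used. The sketch below copies three columns from the table (validation loss, Rouge1, and the F1 column, taken here to be BERTScore F1 based on the hashcode) and picks the best epoch under each criterion. The variable names are illustrative.

```python
# (epoch, validation_loss, rouge1, f1) copied from the training results.
results = [
    (1, 28.1961, 0.1991, 0.8477),
    (2, 25.5517, 0.1938, 0.8464),
    (3, 23.4583, 0.1872, 0.8454),
    (4, 21.7844, 0.1858, 0.8465),
    (5, 20.3738, 0.1858, 0.8465),
    (6, 19.1832, 0.1862, 0.8453),
    (7, 18.1774, 0.1853, 0.8451),
    (8, 17.4050, 0.1862, 0.8447),
    (9, 16.8775, 0.1790, 0.8443),
    (10, 16.6670, 0.1790, 0.8447),
]

# Lower is better for loss; higher is better for Rouge1 and F1.
best_by_loss = min(results, key=lambda row: row[1])
best_by_rouge1 = max(results, key=lambda row: row[2])
best_by_f1 = max(results, key=lambda row: row[3])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest Rouge1:         epoch", best_by_rouge1[0])
print("highest F1:             epoch", best_by_f1[0])
```

The reported evaluation numbers correspond to the final epoch, which has the lowest validation loss but not the highest Rouge1 or F1.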

Framework versions

  • Transformers 4.53.1
  • PyTorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1