---
library_name: transformers
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
- precision
- recall
- f1
model-index:
- name: long_T5_sum_outcome
  results: []
---
# long_T5_sum_outcome

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 16.6670
- Rouge1: 0.179
- Rouge2: 0.0376
- Rougel: 0.1311
- Rougelsum: 0.1303
- Gen Len: 31.0
- Bleu: 0.0183
- Precisions: 0.0456
- Brevity Penalty: 0.952
- Length Ratio: 0.9531
- Translation Length: 1117.0
- Reference Length: 1172.0
- Precision: 0.839
- Recall: 0.8508
- F1: 0.8447
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
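
Usage instructions are not yet part of this card, so the following is a minimal inference sketch. The repo id `your-username/long_T5_sum_outcome` is a placeholder for wherever this checkpoint is hosted, and the generation settings are assumptions rather than the settings used for evaluation.

```python
# Minimal inference sketch; the repo id below is a placeholder, and the
# generation parameters are assumptions, not the evaluation settings.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "your-username/long_T5_sum_outcome"  # placeholder repo id or local path
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # long input document to summarize

# LongT5's transient-global attention is designed for long inputs,
# so truncating at 4096 tokens is a conservative choice.
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```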
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
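
As a rough guide, these settings correspond to the `Seq2SeqTrainingArguments` sketched below; the output directory, evaluation strategy, and `predict_with_generate` flag are assumptions inferred from the per-epoch results that follow.

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments.
# Dataset loading, preprocessing, and metric computation are omitted.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_T5_sum_outcome",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,    # effective train batch size: 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",               # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    seed=42,
    eval_strategy="epoch",             # assumption, inferred from the per-epoch results
    predict_with_generate=True,        # assumption, needed for ROUGE/BLEU on generations
)
```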
### Training results
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|:------:|:----------:|:---------------:|:------------:|:------------------:|:----------------:|:---------:|:------:|:------:|:--------------------------------------------------------:|
| 26.8047 | 1.0 | 7 | 28.1961 | 0.1991 | 0.0452 | 0.1451 | 0.1451 | 31.0 | 0.0226 | 0.0516 | 0.9697 | 0.9701 | 1137.0 | 1172.0 | 0.8417 | 0.854 | 0.8477 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 24.4577 | 2.0 | 14 | 25.5517 | 0.1938 | 0.0452 | 0.1398 | 0.14 | 31.0 | 0.0225 | 0.0511 | 0.9617 | 0.9625 | 1128.0 | 1172.0 | 0.8404 | 0.8526 | 0.8464 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 22.7574 | 3.0 | 21 | 23.4583 | 0.1872 | 0.0412 | 0.1373 | 0.1368 | 31.0 | 0.0218 | 0.0488 | 0.9582 | 0.959 | 1124.0 | 1172.0 | 0.8393 | 0.8517 | 0.8454 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 21.0685 | 4.0 | 28 | 21.7844 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 20.037 | 5.0 | 35 | 20.3738 | 0.1858 | 0.0425 | 0.1358 | 0.1353 | 31.0 | 0.0196 | 0.0485 | 0.9635 | 0.9642 | 1130.0 | 1172.0 | 0.8407 | 0.8525 | 0.8465 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 18.9602 | 6.0 | 42 | 19.1832 | 0.1862 | 0.0397 | 0.1322 | 0.1313 | 31.0 | 0.0194 | 0.0484 | 0.9546 | 0.9556 | 1120.0 | 1172.0 | 0.8396 | 0.8512 | 0.8453 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 18.0035 | 7.0 | 49 | 18.1774 | 0.1853 | 0.0397 | 0.1321 | 0.1313 | 31.0 | 0.0194 | 0.0482 | 0.9555 | 0.9565 | 1121.0 | 1172.0 | 0.8394 | 0.8511 | 0.8451 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 17.2636 | 8.0 | 56 | 17.4050 | 0.1862 | 0.0396 | 0.1328 | 0.132 | 31.0 | 0.0195 | 0.0478 | 0.9493 | 0.9505 | 1114.0 | 1172.0 | 0.8389 | 0.8507 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 16.5914 | 9.0 | 63 | 16.8775 | 0.179 | 0.0376 | 0.1302 | 0.1293 | 31.0 | 0.019 | 0.046 | 0.9502 | 0.9514 | 1115.0 | 1172.0 | 0.8384 | 0.8504 | 0.8443 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 16.2789 | 10.0 | 70 | 16.6670 | 0.179 | 0.0376 | 0.1311 | 0.1303 | 31.0 | 0.0183 | 0.0456 | 0.952 | 0.9531 | 1117.0 | 1172.0 | 0.839 | 0.8508 | 0.8447 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
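
The metric names and the BERTScore hashcode above are consistent with the Hugging Face `evaluate` library, so a sketch of recomputing them is given below. Whether `evaluate` was actually used here is an assumption, and `predictions`/`references` are placeholders for the generated and gold summaries, which are not part of this card.

```python
# Sketch of recomputing the reported metrics with the `evaluate` library.
# predictions/references are placeholders; the evaluation data is not in this card.
import evaluate

predictions = ["a generated summary"]         # placeholder
references = ["the corresponding reference"]  # placeholder

rouge = evaluate.load("rouge")          # Rouge1 / Rouge2 / RougeL / RougeLsum
bleu = evaluate.load("bleu")            # Bleu, Precisions, Brevity Penalty, Length Ratio
bertscore = evaluate.load("bertscore")  # Precision / Recall / F1 and the hashcode above

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large, matching the reported hashcode.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```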
### Framework versions
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1