Lora_long_T5_sum_approach

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.8610
  • Rouge1: 0.4804
  • Rouge2: 0.2605
  • Rougel: 0.4126
  • Rougelsum: 0.4141
  • Gen Len: 28.18
  • Bleu: 0.1536
  • Precisions: 0.2428
  • Brevity Penalty: 0.772
  • Length Ratio: 0.7944
  • Translation Length: 970.0
  • Reference Length: 1221.0
  • Precision: 0.914
  • Recall: 0.9024
  • F1: 0.9081
  • Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
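
This repository ships a PEFT LoRA adapter rather than a full model, so the base checkpoint must be loaded first and the adapter attached on top. The following is a minimal usage sketch, assuming the adapter id floflodebilbao/Lora_long_T5_sum_approach and illustrative generation settings (beam size and max length are not specified in this card):

```python
# Minimal usage sketch: load the base model, attach the LoRA adapter, summarize.
# The adapter repo id and generation settings below are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_id = "google/long-t5-tglobal-base"
adapter_id = "floflodebilbao/Lora_long_T5_sum_approach"  # this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

text = "Replace this with the long document you want to summarize."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```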

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they might map onto a PEFT + Trainer setup):

  • learning_rate: 0.002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
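
The training script itself is not included in this card. The block below is a hedged sketch of how the hyperparameters above could be wired into a PEFT + Seq2SeqTrainer run; the LoRA rank, alpha, dropout, and target modules are assumptions (they are not recorded here), and dataset loading/preprocessing is omitted:

```python
# Hedged sketch of the training configuration described above.
# LoRA settings marked "assumption" are not documented in this card.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, TaskType, get_peft_model

base_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # assumption: rank not stated in the card
    lora_alpha=32,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["q", "v"],  # assumption: typical T5/LongT5 attention projections
)
model = get_peft_model(model, lora_config)

args = Seq2SeqTrainingArguments(
    output_dir="Lora_long_T5_sum_approach",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=None,  # supply the (unspecified) training split here
    eval_dataset=None,   # supply the evaluation split here
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()
```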

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 20.3106 | 1.0 | 7 | 4.7803 | 0.0666 | 0.0125 | 0.0566 | 0.057 | 31.0 | 0.0062 | 0.0248 | 0.5139 | 0.6003 | 733.0 | 1221.0 | 0.7656 | 0.817 | 0.79 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 6.3299 | 2.0 | 14 | 4.0381 | 0.3252 | 0.1232 | 0.2298 | 0.2295 | 30.3 | 0.0656 | 0.1136 | 0.8066 | 0.8231 | 1005.0 | 1221.0 | 0.8607 | 0.8665 | 0.8635 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.881 | 3.0 | 21 | 3.2332 | 0.3357 | 0.141 | 0.2638 | 0.2643 | 28.78 | 0.0835 | 0.1391 | 0.8086 | 0.8247 | 1007.0 | 1221.0 | 0.8722 | 0.8719 | 0.872 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.138 | 4.0 | 28 | 2.8019 | 0.3883 | 0.1806 | 0.3285 | 0.3283 | 29.14 | 0.0964 | 0.1631 | 0.7978 | 0.8157 | 996.0 | 1221.0 | 0.8856 | 0.8835 | 0.8845 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.6873 | 5.0 | 35 | 2.2161 | 0.452 | 0.2271 | 0.3854 | 0.3859 | 27.96 | 0.1276 | 0.2114 | 0.781 | 0.8018 | 979.0 | 1221.0 | 0.9067 | 0.8967 | 0.9016 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.0184 | 6.0 | 42 | 1.3080 | 0.463 | 0.2487 | 0.4009 | 0.4028 | 27.62 | 0.1481 | 0.239 | 0.764 | 0.7879 | 962.0 | 1221.0 | 0.9111 | 0.8991 | 0.9049 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.3413 | 7.0 | 49 | 0.9692 | 0.4678 | 0.2529 | 0.401 | 0.4025 | 28.06 | 0.1473 | 0.2354 | 0.773 | 0.7952 | 971.0 | 1221.0 | 0.9109 | 0.8996 | 0.9051 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.0888 | 8.0 | 56 | 0.8996 | 0.4784 | 0.259 | 0.4102 | 0.4118 | 28.2 | 0.1468 | 0.2363 | 0.775 | 0.7969 | 973.0 | 1221.0 | 0.9126 | 0.9013 | 0.9068 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.9722 | 9.0 | 63 | 0.8690 | 0.4824 | 0.262 | 0.4112 | 0.4129 | 28.22 | 0.1523 | 0.2416 | 0.776 | 0.7977 | 974.0 | 1221.0 | 0.9131 | 0.9019 | 0.9074 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.948 | 10.0 | 70 | 0.8610 | 0.4804 | 0.2605 | 0.4126 | 0.4141 | 28.18 | 0.1536 | 0.2428 | 0.772 | 0.7944 | 970.0 | 1221.0 | 0.914 | 0.9024 | 0.9081 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
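
The columns above combine ROUGE, BLEU, and BERTScore results (the hashcode identifies the BERTScore configuration: roberta-large, layer 17, no IDF). Below is a hedged sketch of how comparable numbers can be computed with the evaluate library; the exact evaluation code used for this card is not shown, and the example strings are placeholders:

```python
# Hedged sketch: compute ROUGE, BLEU, and BERTScore for generated summaries.
# Replace the placeholder strings with real predictions and references.
import evaluate

predictions = ["A generated summary."]
references = ["A reference summary."]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large, matching the hashcode reported above.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```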

Framework versions

  • PEFT 0.15.2
  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1