long_T5_sum_approach

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be recomputed follows the list):

  • Loss: 5.9110
  • Rouge1: 0.1621
  • Rouge2: 0.032
  • Rougel: 0.1254
  • Rougelsum: 0.126
  • Gen Len: 20.0
  • Bleu: 0.0
  • Precisions: 0.051
  • Brevity Penalty: 0.5104
  • Length Ratio: 0.5979
  • Translation Length: 730.0
  • Reference Length: 1221.0
  • Precision: 0.84
  • Recall: 0.8425
  • F1: 0.8411
  • Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
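The exact evaluation pipeline is not documented on this card. As a rough, untested sketch, the reported ROUGE, BLEU, and BERTScore numbers (the latter with roberta-large, matching the hashcode above) could be recomputed with the Hugging Face evaluate library as below; the prediction and reference lists are placeholders, not data from this model.

```python
import evaluate

# Placeholder model outputs and gold summaries; replace with real data.
predictions = ["generated summary ..."]
references = ["reference summary ..."]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))   # rouge1, rouge2, rougeL, rougeLsum
print(bleu.compute(predictions=predictions, references=references))    # bleu, precisions, brevity_penalty, length_ratio, ...
print(bertscore.compute(predictions=predictions, references=references,
                        model_type="roberta-large"))                   # precision, recall, f1, hashcode
```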

Model description

More information needed

Intended uses & limitations

More information needed
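No usage guidance is provided. Below is a minimal, untested sketch of loading the checkpoint for long-document summarization with transformers; the repo id is taken from this card's model tree, while the input window and generation settings are illustrative assumptions rather than documented choices.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id; adjust if the checkpoint is stored elsewhere.
model_id = "floflodebilbao/long_T5_sum_approach"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # long input document to summarize

# The 4096-token input window, beam search, and summary length are
# illustrative settings, not values reported on this card.
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```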

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the Seq2SeqTrainingArguments sketch after the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 12
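For reference, a hypothetical mapping of these values onto transformers' Seq2SeqTrainingArguments is sketched below; output_dir and predict_with_generate are placeholders and assumptions, not values reported on this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_T5_sum_approach",  # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,     # yields the effective train batch size of 16
    num_train_epochs=12,
    lr_scheduler_type="linear",
    optim="adamw_torch",                # AdamW with betas=(0.9, 0.999), eps=1e-08
    seed=42,
    predict_with_generate=True,         # assumption: generation needed for ROUGE/BLEU during eval
)
```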

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 7 | 25.6156 | 0.2003 | 0.0566 | 0.1655 | 0.1653 | 20.0 | 0.0197 | 0.072 | 0.5371 | 0.6167 | 753.0 | 1221.0 | 0.8574 | 0.8505 | 0.8538 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 2.0 | 14 | 23.0400 | 0.1953 | 0.0521 | 0.1616 | 0.1611 | 20.0 | 0.0186 | 0.0701 | 0.5302 | 0.6118 | 747.0 | 1221.0 | 0.8566 | 0.8497 | 0.8531 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 3.0 | 21 | 20.9401 | 0.1905 | 0.0494 | 0.157 | 0.1574 | 20.0 | 0.0145 | 0.0629 | 0.5394 | 0.6183 | 755.0 | 1221.0 | 0.8581 | 0.8507 | 0.8543 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 4.0 | 28 | 19.2980 | 0.1983 | 0.0521 | 0.1626 | 0.1635 | 20.0 | 0.0157 | 0.0682 | 0.5337 | 0.6143 | 750.0 | 1221.0 | 0.8595 | 0.8511 | 0.8552 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 5.0 | 35 | 17.7768 | 0.2064 | 0.0605 | 0.1704 | 0.1719 | 20.0 | 0.0217 | 0.0758 | 0.5302 | 0.6118 | 747.0 | 1221.0 | 0.8612 | 0.8524 | 0.8567 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 6.0 | 42 | 16.2705 | 0.2071 | 0.0636 | 0.1691 | 0.1699 | 20.0 | 0.0273 | 0.0818 | 0.5313 | 0.6126 | 748.0 | 1221.0 | 0.8599 | 0.851 | 0.8553 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 7.0 | 49 | 14.6608 | 0.2018 | 0.0607 | 0.1676 | 0.1689 | 20.0 | 0.0263 | 0.0797 | 0.5244 | 0.6077 | 742.0 | 1221.0 | 0.8575 | 0.85 | 0.8536 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 8.0 | 56 | 12.7872 | 0.1914 | 0.0533 | 0.1564 | 0.1566 | 20.0 | 0.0263 | 0.0771 | 0.5267 | 0.6093 | 744.0 | 1221.0 | 0.8528 | 0.8469 | 0.8498 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 9.0 | 63 | 10.6116 | 0.2038 | 0.0562 | 0.1631 | 0.1637 | 20.0 | 0.0295 | 0.0836 | 0.5232 | 0.6069 | 741.0 | 1221.0 | 0.8542 | 0.8486 | 0.8513 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 10.0 | 70 | 8.3062 | 0.1963 | 0.0497 | 0.1559 | 0.1558 | 20.0 | 0.0244 | 0.0747 | 0.5244 | 0.6077 | 742.0 | 1221.0 | 0.8504 | 0.8472 | 0.8487 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 11.0 | 77 | 6.5339 | 0.1794 | 0.0401 | 0.1404 | 0.1411 | 20.0 | 0.0182 | 0.0649 | 0.5186 | 0.6036 | 737.0 | 1221.0 | 0.8448 | 0.8443 | 0.8444 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 12.0 | 84 | 5.9110 | 0.1621 | 0.032 | 0.1254 | 0.126 | 20.0 | 0.0 | 0.051 | 0.5104 | 0.5979 | 730.0 | 1221.0 | 0.84 | 0.8425 | 0.8411 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1