---
library_name: peft
license: apache-2.0
base_model: google/long-t5-tglobal-base
tags:
- generated_from_trainer
metrics:
- rouge
- bleu
- precision
- recall
- f1
model-index:
- name: Lora_long_T5_sum_challenge
  results: []
---
# Lora_long_T5_sum_challenge

This model is a fine-tuned version of [google/long-t5-tglobal-base](https://huggingface.co/google/long-t5-tglobal-base) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.2142
- Rouge1: 0.2852
- Rouge2: 0.0966
- Rougel: 0.2231
- Rougelsum: 0.2243
- Gen Len: 28.38
- Bleu: 0.0405
- Precisions: 0.0919
- Brevity Penalty: 0.8771
- Length Ratio: 0.8841
- Translation Length: 1068.0
- Reference Length: 1208.0
- Precision: 0.8739
- Recall: 0.8718
- F1: 0.8728
- Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
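
The Precision, Recall, and F1 values above appear to be BERTScore results (the hashcode references `roberta-large`), while the Rouge and Bleu values come from the standard summarization metrics. The exact evaluation script is not included in this card; the sketch below only illustrates how comparable numbers could be computed with the `evaluate` library, using placeholder predictions and references.

```python
# Hedged sketch: computing ROUGE, BLEU and BERTScore with the `evaluate` library.
# The roberta-large BERTScore model follows the hashcode reported in this card;
# `predictions` and `references` are placeholders, not the actual evaluation set.
import evaluate

predictions = ["a generated summary"]   # placeholder model outputs
references = ["the reference summary"]  # placeholder gold summaries

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

rouge_scores = rouge.compute(predictions=predictions, references=references)
bleu_scores = bleu.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(
    predictions=predictions,
    references=references,
    model_type="roberta-large",  # matches the roberta-large_L17 hashcode above
)

print(rouge_scores, bleu_scores, bert_scores)
```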
## Model description
More information needed
Intended uses & limitations
More information needed
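
No usage notes were provided. Since this is a PEFT (LoRA) adapter on top of `google/long-t5-tglobal-base`, loading it for summarization would look roughly like the sketch below; the adapter repository id and the generation settings are placeholders, not values confirmed by this card.

```python
# Hedged usage sketch for a LoRA adapter trained on google/long-t5-tglobal-base.
# "<adapter-repo-id>" is a placeholder for wherever this adapter is hosted;
# generation parameters are illustrative, not taken from the training setup.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_model_id = "google/long-t5-tglobal-base"
adapter_id = "<adapter-repo-id>"  # e.g. "username/Lora_long_T5_sum_challenge"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_model_id)
model = PeftModel.from_pretrained(base_model, adapter_id)

text = "Long document to summarize ..."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
summary_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```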
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged configuration sketch follows the list):
- learning_rate: 0.002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 16
- total_train_batch_size: 16
- optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 10
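
A rough reconstruction of the corresponding training setup is sketched below. Only the `Seq2SeqTrainingArguments` values mirror the list above; the `LoraConfig` settings (`r`, `lora_alpha`, `lora_dropout`, `target_modules`) are illustrative assumptions, since they are not reported in this card.

```python
# Hedged sketch of a training configuration matching the hyperparameters above.
# The LoraConfig values are illustrative assumptions, NOT documented in this card.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, get_peft_model, TaskType

base_model_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(base_model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_model_id)

# Assumed LoRA adapter settings (placeholders).
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # common choice for T5-style attention projections
)
model = get_peft_model(model, lora_config)

# Values below mirror the hyperparameter list in this card.
training_args = Seq2SeqTrainingArguments(
    output_dir="Lora_long_T5_sum_challenge",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    predict_with_generate=True,
)
# A Seq2SeqTrainer would then be constructed with these arguments,
# the PEFT-wrapped model, and the (unspecified) training/evaluation datasets.
```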
### Training results
Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
22.2581 | 1.0 | 7 | 6.5353 | 0.084 | 0.0147 | 0.0714 | 0.0714 | 31.0 | 0.0047 | 0.0247 | 0.5558 | 0.63 | 761.0 | 1208.0 | 0.7817 | 0.8234 | 0.8014 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
6.7792 | 2.0 | 14 | 5.1759 | 0.1642 | 0.0129 | 0.13 | 0.1296 | 30.46 | 0.0 | 0.044 | 0.755 | 0.7806 | 943.0 | 1208.0 | 0.8343 | 0.8356 | 0.8349 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
4.5124 | 3.0 | 21 | 3.7445 | 0.2094 | 0.0517 | 0.1669 | 0.1666 | 28.9 | 0.021 | 0.0606 | 0.8336 | 0.846 | 1022.0 | 1208.0 | 0.8516 | 0.8529 | 0.8521 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
3.5042 | 4.0 | 28 | 3.1497 | 0.2314 | 0.0579 | 0.1774 | 0.1772 | 29.1 | 0.0317 | 0.0716 | 0.8537 | 0.8634 | 1043.0 | 1208.0 | 0.855 | 0.8584 | 0.8567 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
2.8574 | 5.0 | 35 | 2.0950 | 0.2342 | 0.0664 | 0.1895 | 0.1897 | 28.34 | 0.0325 | 0.0756 | 0.8584 | 0.8675 | 1048.0 | 1208.0 | 0.8581 | 0.8605 | 0.8593 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
2.0046 | 6.0 | 42 | 1.4599 | 0.2643 | 0.0843 | 0.2074 | 0.2081 | 28.18 | 0.036 | 0.0853 | 0.8678 | 0.8758 | 1058.0 | 1208.0 | 0.8665 | 0.8652 | 0.8658 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
1.4948 | 7.0 | 49 | 1.2786 | 0.2831 | 0.0921 | 0.2203 | 0.2208 | 28.3 | 0.0413 | 0.0893 | 0.8855 | 0.8916 | 1077.0 | 1208.0 | 0.8703 | 0.8681 | 0.8691 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
1.2731 | 8.0 | 56 | 1.2338 | 0.2802 | 0.096 | 0.2204 | 0.2221 | 28.26 | 0.0406 | 0.0893 | 0.8753 | 0.8825 | 1066.0 | 1208.0 | 0.8729 | 0.8705 | 0.8717 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
1.1977 | 9.0 | 63 | 1.2179 | 0.2834 | 0.0991 | 0.2233 | 0.2244 | 28.42 | 0.0409 | 0.0919 | 0.8725 | 0.88 | 1063.0 | 1208.0 | 0.8745 | 0.8722 | 0.8733 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
1.1717 | 10.0 | 70 | 1.2142 | 0.2852 | 0.0966 | 0.2231 | 0.2243 | 28.38 | 0.0405 | 0.0919 | 0.8771 | 0.8841 | 1068.0 | 1208.0 | 0.8739 | 0.8718 | 0.8728 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
### Framework versions
- PEFT 0.15.2
- Transformers 4.53.1
- Pytorch 2.7.0+cu126
- Datasets 3.6.0
- Tokenizers 0.21.1