Lora_long_T5_sum_outcome

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.1051
Rouge1: 0.3789
Rouge2: 0.1817
Rougel: 0.3238
Rougelsum: 0.3256
Gen Len: 27.8
Bleu: 0.0865
Precisions: 0.1534
Brevity Penalty: 0.8221
Length Ratio: 0.8362
Translation Length: 980.0
Reference Length: 1172.0
BERTScore Precision: 0.8937
BERTScore Recall: 0.8862
BERTScore F1: 0.8898
Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
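
The BLEU-related figures above are tied together by simple arithmetic. The following is a minimal sketch, assuming the standard BLEU definition of the brevity penalty (the card does not state which implementation produced the numbers). The function name `brevity_penalty` is illustrative and not part of any library.

```python
import math


def brevity_penalty(translation_length: float, reference_length: float) -> float:
    """Standard BLEU brevity penalty: 1 if the output is at least as long
    as the reference, otherwise exp(1 - reference / translation)."""
    if translation_length == 0:
        return 0.0
    if translation_length >= reference_length:
        return 1.0
    return math.exp(1.0 - reference_length / translation_length)


# Final-epoch values reported in this card.
translation_length = 980.0
reference_length = 1172.0

length_ratio = translation_length / reference_length
penalty = brevity_penalty(translation_length, reference_length)

print(f"length ratio:    {length_ratio:.4f}")
print(f"brevity penalty: {penalty:.4f}")
```

Under that assumption, the reported Length Ratio and Brevity Penalty follow directly from the Translation Length and Reference Length. A penalty below 1 indicates that the generated summaries are, in total, shorter than the references.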

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.002
train_batch_size: 1
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
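
To make the hyperparameters above concrete, here is a small sketch of how the effective batch size and a linear learning-rate schedule work out. It assumes zero warmup steps (the card lists none) and a total of 70 optimizer steps (the final value in the Step column of the training results). `linear_lr` is a hypothetical helper written for illustration, not a library function.

```python
def linear_lr(step: int, base_lr: float = 0.002,
              total_steps: int = 70, warmup_steps: int = 0) -> float:
    """Linear schedule: optional linear warmup, then linear decay to zero.

    warmup_steps=0 is an assumption; the card does not report a warmup.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Effective batch size = per-device batch size * gradient accumulation steps.
train_batch_size = 1
gradient_accumulation_steps = 16
total_train_batch_size = train_batch_size * gradient_accumulation_steps

print("effective batch size:", total_train_batch_size)
for step in (0, 7, 35, 63, 70):
    print(f"step {step:2d}: lr = {linear_lr(step):.6f}")
```

With these assumptions the learning rate starts at 0.002, halves by the midpoint of training, and reaches zero at the last step.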

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len	Bleu	Precisions	Brevity Penalty	Length Ratio	Translation Length	Reference Length	Precision	Recall	F1	Hashcode
22.0917	1.0	7	5.3855	0.0468	0.0056	0.0416	0.0415	31.0	0.0	0.016	0.5803	0.6476	759.0	1172.0	0.7506	0.8197	0.7828	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
6.5733	2.0	14	4.6730	0.1909	0.0287	0.1473	0.1475	30.88	0.0179	0.0488	0.8856	0.8916	1045.0	1172.0	0.8418	0.8462	0.844	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
4.2163	3.0	21	3.6942	0.2295	0.0424	0.1634	0.1642	29.08	0.0264	0.0695	0.8469	0.8575	1005.0	1172.0	0.8546	0.8582	0.8563	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.5683	4.0	28	3.1688	0.2805	0.0846	0.2121	0.2134	28.98	0.0383	0.0906	0.8469	0.8575	1005.0	1172.0	0.8681	0.8666	0.8672	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.2672	5.0	35	2.8633	0.325	0.1351	0.2652	0.2669	28.4	0.0652	0.1242	0.8341	0.8464	992.0	1172.0	0.8823	0.8776	0.8799	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.0146	6.0	42	2.4207	0.3326	0.1431	0.2839	0.2856	28.08	0.0788	0.1344	0.839	0.8507	997.0	1172.0	0.8839	0.879	0.8813	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
2.4539	7.0	49	1.7916	0.3471	0.1565	0.2932	0.2931	28.26	0.0882	0.1431	0.839	0.8507	997.0	1172.0	0.8863	0.882	0.884	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
1.965	8.0	56	1.3215	0.3607	0.1749	0.3113	0.3125	28.18	0.0925	0.1498	0.8331	0.8456	991.0	1172.0	0.889	0.8839	0.8863	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
1.7658	9.0	63	1.1630	0.3772	0.1782	0.3211	0.3228	27.8	0.0838	0.1518	0.813	0.8285	971.0	1172.0	0.8937	0.8859	0.8897	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
1.5019	10.0	70	1.1051	0.3789	0.1817	0.3238	0.3256	27.8	0.0865	0.1534	0.8221	0.8362	980.0	1172.0	0.8937	0.8862	0.8898	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)

Framework versions

PEFT 0.15.2
Transformers 4.53.1
Pytorch 2.7.0+cu126
Datasets 3.6.0
Tokenizers 0.21.1
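
The card gives no usage instructions, so the following is only a sketch of how a PEFT adapter of this kind is typically loaded on top of its base model. It assumes the repository contains a LoRA adapter (suggested by the model name and the PEFT version listed above) published under the id `floflodebilbao/Lora_long_T5_sum_outcome`. The helper names `load_summarizer` and `summarize`, and the `max_new_tokens` default, are illustrative choices and not taken from the card.

```python
BASE_MODEL_ID = "google/long-t5-tglobal-base"
# Assumed Hub repository id of the adapter.
ADAPTER_ID = "floflodebilbao/Lora_long_T5_sum_outcome"


def load_summarizer():
    """Load the base model and attach the adapter weights (downloads from the Hub)."""
    # Imported lazily so this file can be read or imported without the heavy dependencies.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def summarize(text, tokenizer, model, max_new_tokens=64):
    """Generate a summary for one input document."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A typical call would be `tokenizer, model = load_summarizer()` followed by `summarize(document_text, tokenizer, model)`. Since the reported average generation length is about 28 tokens, a `max_new_tokens` of 64 leaves some headroom, but the right value depends on the intended use.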
