Lora_long_T5_sum_approach

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set (a usage sketch follows the list):

  • Loss: 0.8610
  • Rouge1: 0.4804
  • Rouge2: 0.2605
  • Rougel: 0.4126
  • Rougelsum: 0.4141
  • Gen Len: 28.18
  • Bleu: 0.1536
  • Precisions: 0.2428
  • Brevity Penalty: 0.772
  • Length Ratio: 0.7944
  • Translation Length: 970.0
  • Reference Length: 1221.0
  • Precision: 0.914
  • Recall: 0.9024
  • F1: 0.9081
  • Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
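
This repository ships a PEFT LoRA adapter rather than a full model, so the base checkpoint must be loaded first and the adapter attached on top. The following is a minimal usage sketch, assuming the adapter id floflodebilbao/Lora_long_T5_sum_approach and illustrative generation settings (beam size and max length are not specified in this card):

```python
# Minimal usage sketch: load the base model, attach the LoRA adapter, summarize.
# The adapter repo id and generation settings below are assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from peft import PeftModel

base_id = "google/long-t5-tglobal-base"
adapter_id = "floflodebilbao/Lora_long_T5_sum_approach"  # this adapter

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

text = "Replace this with the long document you want to summarize."
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```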

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the sketch after this list for how they might map onto a PEFT + Trainer setup):

  • learning_rate: 0.002
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
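
The training script itself is not included in this card. The block below is a hedged sketch of how the hyperparameters above could be wired into a PEFT + Seq2SeqTrainer run; the LoRA rank, alpha, dropout, and target modules are assumptions (they are not recorded here), and dataset loading/preprocessing is omitted:

```python
# Hedged sketch of the training configuration described above.
# LoRA settings marked "assumption" are not documented in this card.
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)
from peft import LoraConfig, TaskType, get_peft_model

base_id = "google/long-t5-tglobal-base"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                       # assumption: rank not stated in the card
    lora_alpha=32,              # assumption
    lora_dropout=0.05,          # assumption
    target_modules=["q", "v"],  # assumption: typical T5/LongT5 attention projections
)
model = get_peft_model(model, lora_config)

args = Seq2SeqTrainingArguments(
    output_dir="Lora_long_T5_sum_approach",
    learning_rate=2e-3,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,  # effective batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=None,  # supply the (unspecified) training split here
    eval_dataset=None,   # supply the evaluation split here
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
# trainer.train()
```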

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| 20.3106 | 1.0 | 7 | 4.7803 | 0.0666 | 0.0125 | 0.0566 | 0.057 | 31.0 | 0.0062 | 0.0248 | 0.5139 | 0.6003 | 733.0 | 1221.0 | 0.7656 | 0.817 | 0.79 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 6.3299 | 2.0 | 14 | 4.0381 | 0.3252 | 0.1232 | 0.2298 | 0.2295 | 30.3 | 0.0656 | 0.1136 | 0.8066 | 0.8231 | 1005.0 | 1221.0 | 0.8607 | 0.8665 | 0.8635 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.881 | 3.0 | 21 | 3.2332 | 0.3357 | 0.141 | 0.2638 | 0.2643 | 28.78 | 0.0835 | 0.1391 | 0.8086 | 0.8247 | 1007.0 | 1221.0 | 0.8722 | 0.8719 | 0.872 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 3.138 | 4.0 | 28 | 2.8019 | 0.3883 | 0.1806 | 0.3285 | 0.3283 | 29.14 | 0.0964 | 0.1631 | 0.7978 | 0.8157 | 996.0 | 1221.0 | 0.8856 | 0.8835 | 0.8845 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.6873 | 5.0 | 35 | 2.2161 | 0.452 | 0.2271 | 0.3854 | 0.3859 | 27.96 | 0.1276 | 0.2114 | 0.781 | 0.8018 | 979.0 | 1221.0 | 0.9067 | 0.8967 | 0.9016 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 2.0184 | 6.0 | 42 | 1.3080 | 0.463 | 0.2487 | 0.4009 | 0.4028 | 27.62 | 0.1481 | 0.239 | 0.764 | 0.7879 | 962.0 | 1221.0 | 0.9111 | 0.8991 | 0.9049 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.3413 | 7.0 | 49 | 0.9692 | 0.4678 | 0.2529 | 0.401 | 0.4025 | 28.06 | 0.1473 | 0.2354 | 0.773 | 0.7952 | 971.0 | 1221.0 | 0.9109 | 0.8996 | 0.9051 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 1.0888 | 8.0 | 56 | 0.8996 | 0.4784 | 0.259 | 0.4102 | 0.4118 | 28.2 | 0.1468 | 0.2363 | 0.775 | 0.7969 | 973.0 | 1221.0 | 0.9126 | 0.9013 | 0.9068 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.9722 | 9.0 | 63 | 0.8690 | 0.4824 | 0.262 | 0.4112 | 0.4129 | 28.22 | 0.1523 | 0.2416 | 0.776 | 0.7977 | 974.0 | 1221.0 | 0.9131 | 0.9019 | 0.9074 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| 0.948 | 10.0 | 70 | 0.8610 | 0.4804 | 0.2605 | 0.4126 | 0.4141 | 28.18 | 0.1536 | 0.2428 | 0.772 | 0.7944 | 970.0 | 1221.0 | 0.914 | 0.9024 | 0.9081 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
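
The columns above combine ROUGE, BLEU, and BERTScore results (the hashcode identifies the BERTScore configuration: roberta-large, layer 17, no IDF). Below is a hedged sketch of how comparable numbers can be computed with the evaluate library; the exact evaluation code used for this card is not shown, and the example strings are placeholders:

```python
# Hedged sketch: compute ROUGE, BLEU, and BERTScore for generated summaries.
# Replace the placeholder strings with real predictions and references.
import evaluate

predictions = ["A generated summary."]
references = ["A reference summary."]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=references))
# lang="en" selects roberta-large, matching the hashcode reported above.
print(bertscore.compute(predictions=predictions, references=references, lang="en"))
```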

Framework versions

  • PEFT 0.15.2
  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1