long_T5_sum_approach

This model is a fine-tuned version of google/long-t5-tglobal-base on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how these metrics can be recomputed follows the list):

  • Loss: 5.9110
  • Rouge1: 0.1621
  • Rouge2: 0.032
  • Rougel: 0.1254
  • Rougelsum: 0.126
  • Gen Len: 20.0
  • Bleu: 0.0
  • Precisions: 0.051
  • Brevity Penalty: 0.5104
  • Length Ratio: 0.5979
  • Translation Length: 730.0
  • Reference Length: 1221.0
  • Precision: 0.84
  • Recall: 0.8425
  • F1: 0.8411
  • Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
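The exact evaluation pipeline is not documented on this card. As a rough, untested sketch, the reported ROUGE, BLEU, and BERTScore numbers (the latter with roberta-large, matching the hashcode above) could be recomputed with the Hugging Face evaluate library as below; the prediction and reference lists are placeholders, not data from this model.

```python
import evaluate

# Placeholder model outputs and gold summaries; replace with real data.
predictions = ["generated summary ..."]
references = ["reference summary ..."]

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

print(rouge.compute(predictions=predictions, references=references))   # rouge1, rouge2, rougeL, rougeLsum
print(bleu.compute(predictions=predictions, references=references))    # bleu, precisions, brevity_penalty, length_ratio, ...
print(bertscore.compute(predictions=predictions, references=references,
                        model_type="roberta-large"))                   # precision, recall, f1, hashcode
```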

Model description

More information needed

Intended uses & limitations

More information needed
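No usage guidance is provided. Below is a minimal, untested sketch of loading the checkpoint for long-document summarization with transformers; the repo id is taken from this card's model tree, while the input window and generation settings are illustrative assumptions rather than documented choices.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Assumed repo id; adjust if the checkpoint is stored elsewhere.
model_id = "floflodebilbao/long_T5_sum_approach"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

document = "..."  # long input document to summarize

# The 4096-token input window, beam search, and summary length are
# illustrative settings, not values reported on this card.
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=4096)
summary_ids = model.generate(**inputs, max_new_tokens=64, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```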

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the Seq2SeqTrainingArguments sketch after the list):

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 12
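For reference, a hypothetical mapping of these values onto transformers' Seq2SeqTrainingArguments is sketched below; output_dir and predict_with_generate are placeholders and assumptions, not values reported on this card.

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="long_T5_sum_approach",  # placeholder, not from the card
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=16,     # yields the effective train batch size of 16
    num_train_epochs=12,
    lr_scheduler_type="linear",
    optim="adamw_torch",                # AdamW with betas=(0.9, 0.999), eps=1e-08
    seed=42,
    predict_with_generate=True,         # assumption: generation needed for ROUGE/BLEU during eval
)
```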

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 | Hashcode |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 7 | 25.6156 | 0.2003 | 0.0566 | 0.1655 | 0.1653 | 20.0 | 0.0197 | 0.072 | 0.5371 | 0.6167 | 753.0 | 1221.0 | 0.8574 | 0.8505 | 0.8538 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 2.0 | 14 | 23.0400 | 0.1953 | 0.0521 | 0.1616 | 0.1611 | 20.0 | 0.0186 | 0.0701 | 0.5302 | 0.6118 | 747.0 | 1221.0 | 0.8566 | 0.8497 | 0.8531 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 3.0 | 21 | 20.9401 | 0.1905 | 0.0494 | 0.157 | 0.1574 | 20.0 | 0.0145 | 0.0629 | 0.5394 | 0.6183 | 755.0 | 1221.0 | 0.8581 | 0.8507 | 0.8543 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 4.0 | 28 | 19.2980 | 0.1983 | 0.0521 | 0.1626 | 0.1635 | 20.0 | 0.0157 | 0.0682 | 0.5337 | 0.6143 | 750.0 | 1221.0 | 0.8595 | 0.8511 | 0.8552 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 5.0 | 35 | 17.7768 | 0.2064 | 0.0605 | 0.1704 | 0.1719 | 20.0 | 0.0217 | 0.0758 | 0.5302 | 0.6118 | 747.0 | 1221.0 | 0.8612 | 0.8524 | 0.8567 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 6.0 | 42 | 16.2705 | 0.2071 | 0.0636 | 0.1691 | 0.1699 | 20.0 | 0.0273 | 0.0818 | 0.5313 | 0.6126 | 748.0 | 1221.0 | 0.8599 | 0.851 | 0.8553 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 7.0 | 49 | 14.6608 | 0.2018 | 0.0607 | 0.1676 | 0.1689 | 20.0 | 0.0263 | 0.0797 | 0.5244 | 0.6077 | 742.0 | 1221.0 | 0.8575 | 0.85 | 0.8536 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 8.0 | 56 | 12.7872 | 0.1914 | 0.0533 | 0.1564 | 0.1566 | 20.0 | 0.0263 | 0.0771 | 0.5267 | 0.6093 | 744.0 | 1221.0 | 0.8528 | 0.8469 | 0.8498 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 9.0 | 63 | 10.6116 | 0.2038 | 0.0562 | 0.1631 | 0.1637 | 20.0 | 0.0295 | 0.0836 | 0.5232 | 0.6069 | 741.0 | 1221.0 | 0.8542 | 0.8486 | 0.8513 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 10.0 | 70 | 8.3062 | 0.1963 | 0.0497 | 0.1559 | 0.1558 | 20.0 | 0.0244 | 0.0747 | 0.5244 | 0.6077 | 742.0 | 1221.0 | 0.8504 | 0.8472 | 0.8487 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 11.0 | 77 | 6.5339 | 0.1794 | 0.0401 | 0.1404 | 0.1411 | 20.0 | 0.0182 | 0.0649 | 0.5186 | 0.6036 | 737.0 | 1221.0 | 0.8448 | 0.8443 | 0.8444 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |
| No log | 12.0 | 84 | 5.9110 | 0.1621 | 0.032 | 0.1254 | 0.126 | 20.0 | 0.0 | 0.051 | 0.5104 | 0.5979 | 730.0 | 1221.0 | 0.84 | 0.8425 | 0.8411 | roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1) |

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1