LED_sum_outcome

This model is a fine-tuned version of allenai/led-base-16384 on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 3.4730
Rouge1: 0.3601
Rouge2: 0.1515
Rougel: 0.3017
Rougelsum: 0.301
Gen Len: 20.36
Bleu: 0.0595
Precisions: 0.1544
Brevity Penalty: 0.6108
Length Ratio: 0.6698
Translation Length: 785.0
Reference Length: 1172.0
Precision: 0.8997
Recall: 0.8766
F1: 0.8879
Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
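
The BLEU length statistics above are mutually consistent, which can be checked by hand. The sketch below assumes the standard BLEU definitions (length ratio = translation length / reference length; brevity penalty = exp(1 - reference length / translation length) when the output is shorter than the reference). The function names are illustrative and not taken from the evaluation code.

```python
import math

# Values copied from the evaluation results above.
translation_length = 785.0
reference_length = 1172.0


def length_ratio(hyp_len: float, ref_len: float) -> float:
    """Ratio of generated length to reference length."""
    return hyp_len / ref_len


def brevity_penalty(hyp_len: float, ref_len: float) -> float:
    """Standard BLEU brevity penalty (assumed definition, see lead-in)."""
    if hyp_len == 0:
        return 0.0
    if hyp_len >= ref_len:
        return 1.0
    return math.exp(1.0 - ref_len / hyp_len)


ratio = length_ratio(translation_length, reference_length)
bp = brevity_penalty(translation_length, reference_length)
print(f"length ratio:    {ratio:.4f}")
print(f"brevity penalty: {bp:.4f}")
```

Both values round to the reported 0.6698 and 0.6108. A length ratio of about 0.67 means the generated summaries are roughly a third shorter than the references, and the resulting brevity penalty scales the BLEU score down accordingly.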

Model description

More information needed

Intended uses & limitations

More information needed
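
Until this section is filled in, the following is a minimal inference sketch, not the author's documented usage. It assumes the checkpoint is published under the repository id `floflodebilbao/LED_sum_outcome` together with its tokenizer, that `torch` and `transformers` are installed, and that the usual LED summarization convention applies (global attention on the first token only). The helper names `global_attention_mask` and `summarize` and the `max_new_tokens` default are hypothetical.

```python
MODEL_ID = "floflodebilbao/LED_sum_outcome"  # assumed repository id


def global_attention_mask(seq_len: int) -> list[int]:
    """Return a mask with global attention on the first token only.

    This is the common LED convention for summarization; it is an
    assumption here, not something stated in this model card.
    """
    if seq_len < 1:
        raise ValueError("seq_len must be at least 1")
    return [1] + [0] * (seq_len - 1)


def summarize(text: str, max_new_tokens: int = 32) -> str:
    """Generate a summary for `text` (downloads the model on first use)."""
    # Imported lazily so the helper above works without the heavy dependencies.
    import torch
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)
    model.eval()

    # 16384 tokens is the input limit of the base model allenai/led-base-16384.
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)
    seq_len = inputs["input_ids"].shape[1]
    global_mask = torch.tensor([global_attention_mask(seq_len)])

    with torch.no_grad():
        output_ids = model.generate(
            inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_mask,
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `summarize(long_document)` would return a short summary string. The small `max_new_tokens` default reflects the reported average generation length of about 20 tokens and may need adjusting.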

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
mixed_precision_training: Native AMP
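
Two of the values above follow from the others, which the sketch below works through. It assumes training on a single device and no warmup steps (none are listed), and it takes the total of 70 optimizer steps from the training results table. The function names are illustrative; the schedule is the usual linear decay to zero, written out by hand rather than taken from the training code.

```python
# Values copied from the hyperparameter list above.
learning_rate = 2e-05
train_batch_size = 4
gradient_accumulation_steps = 4

num_devices = 1   # assumption: single-device training
total_steps = 70  # last step in the training results table
warmup_steps = 0  # assumption: no warmup is listed


def effective_batch_size(per_device: int, accumulation: int, devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accumulation * devices


def linear_lr(step: int, base_lr: float, total: int, warmup: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


print(effective_batch_size(train_batch_size, gradient_accumulation_steps, num_devices))
for step in (0, 35, 70):
    print(step, linear_lr(step, learning_rate, total_steps, warmup_steps))
```

Under these assumptions the effective batch size matches the listed total_train_batch_size of 16, and the learning rate falls from 2e-05 at step 0 to half that value at step 35 and to zero at step 70.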

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len	Bleu	Precisions	Brevity Penalty	Length Ratio	Translation Length	Reference Length	Precision	Recall	F1	Hashcode
No log	1.0	7	7.7628	0.2668	0.0617	0.2177	0.2179	21.0	0.0211	0.0856	0.6935	0.7321	858.0	1172.0	0.8742	0.86	0.8669	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	2.0	14	6.5648	0.3427	0.124	0.2806	0.2804	20.16	0.0514	0.1396	0.6085	0.6681	783.0	1172.0	0.8991	0.8717	0.8851	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	3.0	21	5.1851	0.3468	0.1383	0.282	0.2807	19.7	0.0722	0.1711	0.578	0.6459	757.0	1172.0	0.9029	0.8772	0.8898	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	4.0	28	4.4398	0.3475	0.1299	0.2825	0.2821	20.18	0.0455	0.1417	0.598	0.6604	774.0	1172.0	0.8979	0.8766	0.887	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	5.0	35	4.0655	0.3506	0.1412	0.2913	0.2901	19.94	0.054	0.1556	0.5685	0.6391	749.0	1172.0	0.8987	0.8772	0.8877	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	6.0	42	3.8299	0.356	0.148	0.295	0.294	20.38	0.0616	0.1566	0.6073	0.6672	782.0	1172.0	0.9002	0.8781	0.8889	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	7.0	49	3.6727	0.3645	0.144	0.2966	0.296	20.38	0.0637	0.1593	0.6108	0.6698	785.0	1172.0	0.8987	0.8774	0.8878	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	8.0	56	3.5737	0.3586	0.146	0.2948	0.2941	20.44	0.0632	0.1563	0.6201	0.6766	793.0	1172.0	0.8965	0.8762	0.8862	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	9.0	63	3.5072	0.3568	0.1486	0.2976	0.2963	20.36	0.0598	0.1536	0.6189	0.6758	792.0	1172.0	0.8986	0.8769	0.8875	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
No log	10.0	70	3.4730	0.3601	0.1515	0.3017	0.301	20.36	0.0595	0.1544	0.6108	0.6698	785.0	1172.0	0.8997	0.8766	0.8879	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
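
The table shows that the metrics do not all peak at the same epoch. As a small illustration, the sketch below copies four columns from the table (validation loss, Rouge1, Bleu, and the F1 column, which is the BERTScore F1 given the roberta-large hashcode) and reports the best epoch for each. The dictionary keys and the `best_epoch` helper are illustrative names, not part of the training code.

```python
# Selected columns copied from the training results table above.
ROWS = [
    {"epoch": 1, "val_loss": 7.7628, "rouge1": 0.2668, "bleu": 0.0211, "bertscore_f1": 0.8669},
    {"epoch": 2, "val_loss": 6.5648, "rouge1": 0.3427, "bleu": 0.0514, "bertscore_f1": 0.8851},
    {"epoch": 3, "val_loss": 5.1851, "rouge1": 0.3468, "bleu": 0.0722, "bertscore_f1": 0.8898},
    {"epoch": 4, "val_loss": 4.4398, "rouge1": 0.3475, "bleu": 0.0455, "bertscore_f1": 0.8870},
    {"epoch": 5, "val_loss": 4.0655, "rouge1": 0.3506, "bleu": 0.0540, "bertscore_f1": 0.8877},
    {"epoch": 6, "val_loss": 3.8299, "rouge1": 0.3560, "bleu": 0.0616, "bertscore_f1": 0.8889},
    {"epoch": 7, "val_loss": 3.6727, "rouge1": 0.3645, "bleu": 0.0637, "bertscore_f1": 0.8878},
    {"epoch": 8, "val_loss": 3.5737, "rouge1": 0.3586, "bleu": 0.0632, "bertscore_f1": 0.8862},
    {"epoch": 9, "val_loss": 3.5072, "rouge1": 0.3568, "bleu": 0.0598, "bertscore_f1": 0.8875},
    {"epoch": 10, "val_loss": 3.4730, "rouge1": 0.3601, "bleu": 0.0595, "bertscore_f1": 0.8879},
]


def best_epoch(rows: list[dict], metric: str, higher_is_better: bool = True) -> int:
    """Return the epoch whose value for `metric` is best."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda row: row[metric])["epoch"]


best = {
    "val_loss": best_epoch(ROWS, "val_loss", higher_is_better=False),
    "rouge1": best_epoch(ROWS, "rouge1"),
    "bleu": best_epoch(ROWS, "bleu"),
    "bertscore_f1": best_epoch(ROWS, "bertscore_f1"),
}
print(best)
```

Validation loss is lowest at the final epoch, whose row matches the headline evaluation results, while Rouge1 peaks at epoch 7 and both Bleu and BERTScore F1 peak at epoch 3. Which checkpoint is preferable therefore depends on the metric that matters for the intended use.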

Framework versions

Transformers 4.53.0
Pytorch 2.7.0+cu126
Datasets 3.6.0
Tokenizers 0.21.1

Model repository: floflodebilbao/LED_sum_outcome
