Lora_LED_sum_outcome

This model is a fine-tuned version of allenai/led-base-16384; the training dataset is not documented. After the final epoch (step 70), it achieves the following results on the evaluation set:

Loss: 3.7652
Rouge1: 0.3745
Rouge2: 0.1523
Rougel: 0.3056
Rougelsum: 0.3055
Gen Len: 25.92
Bleu: 0.0641
Precisions: 0.1432
Brevity Penalty: 0.7677
Length Ratio: 0.791
Translation Length: 927.0
Reference Length: 1172.0
Precision (BERTScore): 0.8956
Recall (BERTScore): 0.8832
F1 (BERTScore): 0.8892
Hashcode (BERTScore): roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
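
The BLEU length fields above are internally consistent, which can be checked by hand. The following is a minimal sketch, assuming the standard BLEU definitions: length ratio is translation length divided by reference length, and the brevity penalty is exp(1 - reference / translation) when the output is shorter than the reference. The function name is illustrative.

```python
import math


def brevity_penalty(translation_length: float, reference_length: float) -> float:
    """Standard BLEU brevity penalty.

    Returns 1.0 when the output is at least as long as the reference,
    otherwise exp(1 - reference / translation).
    """
    if translation_length <= 0:
        return 0.0
    if translation_length >= reference_length:
        return 1.0
    return math.exp(1.0 - reference_length / translation_length)


translation_length = 927.0   # "Translation Length" reported above
reference_length = 1172.0    # "Reference Length" reported above

length_ratio = translation_length / reference_length
bp = brevity_penalty(translation_length, reference_length)

print(f"length ratio:    {length_ratio:.4f}")
print(f"brevity penalty: {bp:.4f}")
```

Both values agree with the reported Length Ratio (0.791) and Brevity Penalty (0.7677) up to rounding. In total, the generated summaries are about 21% shorter than the references, and BLEU is penalized accordingly.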

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 1
eval_batch_size: 1
seed: 42
gradient_accumulation_steps: 16
total_train_batch_size: 16
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 10
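
The relationship between these values can be worked out in a few lines. This is a rough sketch, not the training script: it assumes a single device and zero warmup steps (neither is stated in this card), and takes the total of 70 optimizer steps from the training results table. The function name `linear_lr` is illustrative.

```python
# Values copied from the hyperparameter list above.
learning_rate = 1e-3
train_batch_size = 1
gradient_accumulation_steps = 16
num_epochs = 10

# Assumptions (not stated in the card): one device, no warmup.
num_devices = 1
warmup_steps = 0

# From the training results table: 70 optimizer steps over 10 epochs.
total_steps = 70

# Effective batch size per optimizer step.
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return learning_rate * remaining / max(1, total_steps - warmup_steps)


print("effective batch size:", total_train_batch_size)
for step in (0, 35, 70):
    print(f"step {step:2d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the learning rate starts at 0.001, is halved by step 35, and reaches zero at step 70, with each optimizer step accumulating gradients from 16 examples.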

Training results

Training Loss	Epoch	Step	Validation Loss	Rouge1	Rouge2	Rougel	Rougelsum	Gen Len	Bleu	Precisions	Brevity Penalty	Length Ratio	Translation Length	Reference Length	Precision	Recall	F1	Hashcode
8.3925	1.0	7	8.0434	0.2138	0.043	0.174	0.1733	32.0	0.0223	0.0541	1.0	1.1246	1318.0	1172.0	0.8553	0.8578	0.8564	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
6.8857	2.0	14	6.1890	0.3112	0.0956	0.2638	0.2638	30.08	0.0477	0.0915	1.0	1.0333	1211.0	1172.0	0.8775	0.8694	0.8733	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
5.1856	3.0	21	4.7196	0.3535	0.1502	0.2855	0.2871	23.4	0.0688	0.1552	0.7034	0.7398	867.0	1172.0	0.9014	0.8803	0.8906	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
4.2664	4.0	28	4.1945	0.354	0.1541	0.295	0.2952	23.14	0.0781	0.1725	0.643	0.6937	813.0	1172.0	0.904	0.8821	0.8928	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.898	5.0	35	3.9578	0.3777	0.1653	0.3107	0.3108	25.16	0.0912	0.1705	0.7434	0.7713	904.0	1172.0	0.9005	0.884	0.892	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.6798	6.0	42	3.8411	0.368	0.1556	0.2914	0.2907	23.92	0.0719	0.1621	0.6836	0.7244	849.0	1172.0	0.9039	0.8834	0.8934	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.5743	7.0	49	3.8041	0.3678	0.1445	0.2954	0.2956	27.28	0.0648	0.1358	0.809	0.8251	967.0	1172.0	0.8937	0.883	0.8883	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.5091	8.0	56	3.7772	0.371	0.1559	0.3051	0.3061	26.3	0.0755	0.1511	0.7709	0.7935	930.0	1172.0	0.896	0.8833	0.8895	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.4502	9.0	63	3.7611	0.3633	0.1495	0.3013	0.3013	25.8	0.0653	0.1423	0.7498	0.7765	910.0	1172.0	0.8952	0.881	0.8879	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
3.4484	10.0	70	3.7652	0.3745	0.1523	0.3056	0.3055	25.92	0.0641	0.1432	0.7677	0.791	927.0	1172.0	0.8956	0.8832	0.8892	roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
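
The table can also be inspected programmatically. The short sketch below uses validation loss and ROUGE-1 values copied from the rows above to find the epochs where each metric is best; the variable names are illustrative.

```python
# (epoch, validation loss, ROUGE-1) copied from the training results table.
rows = [
    (1, 8.0434, 0.2138),
    (2, 6.1890, 0.3112),
    (3, 4.7196, 0.3535),
    (4, 4.1945, 0.3540),
    (5, 3.9578, 0.3777),
    (6, 3.8411, 0.3680),
    (7, 3.8041, 0.3678),
    (8, 3.7772, 0.3710),
    (9, 3.7611, 0.3633),
    (10, 3.7652, 0.3745),
]

# Epoch with the lowest validation loss and epoch with the highest ROUGE-1.
best_loss_epoch = min(rows, key=lambda r: r[1])[0]
best_rouge1_epoch = max(rows, key=lambda r: r[2])[0]

print("lowest validation loss at epoch:", best_loss_epoch)
print("highest ROUGE-1 at epoch:", best_rouge1_epoch)
```

Validation loss is lowest at epoch 9 (3.7611) and ROUGE-1 peaks at epoch 5 (0.3777), while the headline results correspond to epoch 10. The differences are small, which suggests the metrics have largely plateaued from around epoch 5 onward.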

Framework versions

PEFT 0.15.2
Transformers 4.53.1
Pytorch 2.7.0+cu126
Datasets 3.6.0
Tokenizers 0.21.1
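
Given the PEFT dependency listed above and the model name, the checkpoint is presumably a LoRA adapter applied to the base model rather than a full set of weights. The following is a minimal, untested sketch of loading it for inference, assuming the adapter is published as floflodebilbao/Lora_LED_sum_outcome. The helper names, the 64-token generation limit, and the choice to place global attention on the first token are illustrative assumptions, not settings documented in this card.

```python
BASE_MODEL_ID = "allenai/led-base-16384"
ADAPTER_ID = "floflodebilbao/Lora_LED_sum_outcome"  # assumed repository id


def load_model():
    """Load the base LED model and attach the adapter weights (assumed to be LoRA)."""
    # Imported lazily so this module can be read without the libraries installed.
    from peft import PeftModel
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL_ID)
    base_model = AutoModelForSeq2SeqLM.from_pretrained(BASE_MODEL_ID)
    model = PeftModel.from_pretrained(base_model, ADAPTER_ID)
    model.eval()
    return tokenizer, model


def summarize(text, tokenizer, model, max_new_tokens=64):
    """Generate a summary for one document (max_new_tokens is an assumed limit)."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

    # LED uses local attention by default; give the first token global attention.
    global_attention_mask = torch.zeros_like(inputs["input_ids"])
    global_attention_mask[:, 0] = 1

    with torch.no_grad():
        output_ids = model.generate(
            input_ids=inputs["input_ids"],
            attention_mask=inputs["attention_mask"],
            global_attention_mask=global_attention_mask,
            max_new_tokens=max_new_tokens,
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads the model weights on first call):
#   tokenizer, model = load_model()
#   print(summarize("Text of the document to summarize.", tokenizer, model))
```

Keyword arguments are passed to `generate` deliberately, since PEFT wrappers forward them to the underlying model.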

Model repository: floflodebilbao/Lora_LED_sum_outcome