LEDv3_ACLsum_all_aspects

This model is a fine-tuned version of allenai/led-base-16384 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2405
  • Rouge1: 0.3564
  • Rouge2: 0.1444
  • Rougel: 0.296
  • Rougelsum: 0.2951
  • Gen Len: 20.96
  • Bleu: 0.0646
  • Precisions: 0.1586
  • Brevity Penalty: 0.5945
  • Length Ratio: 0.6579
  • Translation Length: 2369.0
  • Reference Length: 3601.0
  • BERTScore Precision: 0.892
  • BERTScore Recall: 0.8776
  • BERTScore F1: 0.8846
  • BERTScore hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1)
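
The base checkpoint, allenai/led-base-16384, is a Longformer Encoder-Decoder (LED) sequence-to-sequence model, so this fine-tune can be loaded with the standard transformers API. Below is a minimal inference sketch, not an example from the model authors; the input text and the generation settings (max_new_tokens, num_beams) are illustrative assumptions:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "floflodebilbao/LEDv3_ACLsum_all_aspects"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # placeholder: document to summarize
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=16384)

# LED uses sparse local attention plus optional global attention;
# by convention the first token is given global attention.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_new_tokens=32,  # assumption; the reported eval Gen Len is ~21 tokens
    num_beams=4,        # assumption
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```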

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
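
For reference, here is a sketch of how these settings map onto transformers Seq2SeqTrainingArguments; output_dir and predict_with_generate are illustrative assumptions, while the remaining values mirror the list above:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="LEDv3_ACLsum_all_aspects",  # assumption: any local path works
    learning_rate=2e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # effective train batch size: 1 x 16 = 16
    optim="adamw_torch",             # AdamW; betas=(0.9, 0.999), eps=1e-8 are defaults
    lr_scheduler_type="linear",
    num_train_epochs=10,
    predict_with_generate=True,      # assumption: needed to compute ROUGE/BLEU at eval
)
```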

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | BERTScore P | BERTScore R | BERTScore F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 19 | 4.0849 | 0.2835 | 0.0826 | 0.2239 | 0.2243 | 20.5333 | 0.0374 | 0.105 | 0.6263 | 0.6812 | 2453.0 | 3601.0 | 0.8837 | 0.8674 | 0.8754 |
| No log | 2.0 | 38 | 3.1922 | 0.2787 | 0.0853 | 0.2229 | 0.2223 | 20.7667 | 0.0396 | 0.1061 | 0.6161 | 0.6737 | 2426.0 | 3601.0 | 0.8767 | 0.8646 | 0.8705 |
| No log | 3.0 | 57 | 2.8716 | 0.292 | 0.098 | 0.2328 | 0.2329 | 20.84 | 0.0461 | 0.1181 | 0.6082 | 0.6679 | 2405.0 | 3601.0 | 0.8812 | 0.8688 | 0.8749 |
| No log | 4.0 | 76 | 2.6529 | 0.3166 | 0.1192 | 0.2573 | 0.2566 | 20.92 | 0.0576 | 0.1366 | 0.6104 | 0.6695 | 2411.0 | 3601.0 | 0.8861 | 0.8725 | 0.8792 |
| No log | 5.0 | 95 | 2.5101 | 0.3441 | 0.1353 | 0.282 | 0.2813 | 20.94 | 0.0613 | 0.1495 | 0.593 | 0.6568 | 2365.0 | 3601.0 | 0.8895 | 0.8763 | 0.8828 |
| No log | 6.0 | 114 | 2.3985 | 0.3501 | 0.1415 | 0.2909 | 0.2912 | 20.92 | 0.0616 | 0.1514 | 0.5983 | 0.6606 | 2379.0 | 3601.0 | 0.8913 | 0.8771 | 0.884 |
| No log | 7.0 | 133 | 2.3215 | 0.3557 | 0.1398 | 0.295 | 0.2949 | 20.9667 | 0.0608 | 0.1545 | 0.593 | 0.6568 | 2365.0 | 3601.0 | 0.8919 | 0.8775 | 0.8846 |
| No log | 8.0 | 152 | 2.2783 | 0.3494 | 0.1417 | 0.2922 | 0.2918 | 20.9333 | 0.0637 | 0.1561 | 0.588 | 0.6532 | 2352.0 | 3601.0 | 0.8907 | 0.8769 | 0.8837 |
| No log | 9.0 | 171 | 2.2467 | 0.3566 | 0.145 | 0.297 | 0.2968 | 20.96 | 0.0649 | 0.1591 | 0.5926 | 0.6565 | 2364.0 | 3601.0 | 0.8921 | 0.8775 | 0.8846 |
| No log | 10.0 | 190 | 2.2405 | 0.3564 | 0.1444 | 0.296 | 0.2951 | 20.96 | 0.0646 | 0.1586 | 0.5945 | 0.6579 | 2369.0 | 3601.0 | 0.892 | 0.8776 | 0.8846 |

Every epoch shares the same BERTScore hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.1).
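
Metrics like those above can be computed with the Hugging Face evaluate library. A minimal sketch follows; the prediction/reference strings are placeholders, and passing model_type="roberta-large" to BERTScore is an assumption chosen to match the hashcode reported above:

```python
import evaluate

predictions = ["generated summary ..."]  # placeholder: model outputs
references = ["reference summary ..."]   # placeholder: gold summaries

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

results = {
    **rouge.compute(predictions=predictions, references=references),
    **bleu.compute(predictions=predictions, references=references),
    # assumption: model_type="roberta-large" (default layer 17, no idf)
    # corresponds to the roberta-large_L17_no-idf hashcode in this card
    "bertscore": bertscore.compute(
        predictions=predictions, references=references, model_type="roberta-large"
    ),
}
print(results)
```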

Framework versions

  • Transformers 4.53.1
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1