LED_sum_approach

This model is a fine-tuned version of allenai/led-base-16384 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 3.3340
  • Rouge1: 0.4569
  • Rouge2: 0.2272
  • Rougel: 0.3918
  • Rougelsum: 0.3927
  • Gen Len: 20.82
  • Bleu: 0.1112
  • Precisions: 0.2401
  • Brevity Penalty: 0.5852
  • Length Ratio: 0.6511
  • Translation Length: 795.0
  • Reference Length: 1221.0
  • BERTScore Precision: 0.9098
  • BERTScore Recall: 0.8906
  • BERTScore F1: 0.9
  • BERTScore Hashcode: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
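
Below is a minimal usage sketch for this checkpoint, assuming it is available on the Hub as floflodebilbao/LED_sum_approach (substitute a local path if needed). LED uses sparse local attention, so the example gives the first token global attention, as is conventional for LED summarization.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "floflodebilbao/LED_sum_approach"  # assumed Hub id; adjust as needed
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

text = "..."  # long input document to summarize

inputs = tokenizer(text, max_length=16384, truncation=True, return_tensors="pt")

# Give the first token global attention so it can attend to the whole sequence.
global_attention_mask = torch.zeros_like(inputs["input_ids"])
global_attention_mask[:, 0] = 1

summary_ids = model.generate(
    inputs["input_ids"],
    attention_mask=inputs["attention_mask"],
    global_attention_mask=global_attention_mask,
    max_length=32,  # generated summaries average ~21 tokens on the evaluation set
    num_beams=4,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```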

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 10
  • mixed_precision_training: Native AMP
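
The sketch below mirrors these hyperparameters as Seq2SeqTrainingArguments. It is an assumption-laden reconstruction: the dataset, preprocessing, and metric computation are not documented here and are omitted, and the per-epoch evaluation setting is inferred from the results table.

```python
from transformers import Seq2SeqTrainingArguments

# Hedged sketch only; output_dir and eval_strategy are assumptions.
training_args = Seq2SeqTrainingArguments(
    output_dir="LED_sum_approach",
    learning_rate=2e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=4,   # effective train batch size of 16
    num_train_epochs=10,
    lr_scheduler_type="linear",
    optim="adamw_torch",
    seed=42,
    fp16=True,                       # native AMP mixed precision
    predict_with_generate=True,
    eval_strategy="epoch",           # the card reports metrics once per epoch
)
```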

Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len | Bleu | Precisions | Brevity Penalty | Length Ratio | Translation Length | Reference Length | Precision | Recall | F1 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| No log | 1.0 | 7 | 7.6476 | 0.348 | 0.1355 | 0.2757 | 0.274 | 21.0 | 0.0602 | 0.1436 | 0.6232 | 0.679 | 829.0 | 1221.0 | 0.8935 | 0.8759 | 0.8845 |
| No log | 2.0 | 14 | 6.4676 | 0.4218 | 0.2049 | 0.3597 | 0.3592 | 20.94 | 0.1008 | 0.2063 | 0.6419 | 0.6929 | 846.0 | 1221.0 | 0.9027 | 0.8839 | 0.8932 |
| No log | 3.0 | 21 | 5.0145 | 0.4189 | 0.2067 | 0.362 | 0.3612 | 20.5 | 0.0945 | 0.2152 | 0.5919 | 0.656 | 801.0 | 1221.0 | 0.9087 | 0.8846 | 0.8964 |
| No log | 4.0 | 28 | 4.2719 | 0.44 | 0.2299 | 0.3791 | 0.3778 | 20.48 | 0.1052 | 0.2337 | 0.5852 | 0.6511 | 795.0 | 1221.0 | 0.9087 | 0.8882 | 0.8982 |
| No log | 5.0 | 35 | 3.9222 | 0.4538 | 0.238 | 0.3919 | 0.3917 | 20.7 | 0.1062 | 0.2404 | 0.5795 | 0.647 | 790.0 | 1221.0 | 0.9126 | 0.891 | 0.9016 |
| No log | 6.0 | 42 | 3.6730 | 0.4582 | 0.2266 | 0.3926 | 0.3922 | 20.82 | 0.1093 | 0.236 | 0.5908 | 0.6552 | 800.0 | 1221.0 | 0.9099 | 0.8895 | 0.8995 |
| No log | 7.0 | 49 | 3.5177 | 0.4639 | 0.2385 | 0.4037 | 0.4033 | 20.76 | 0.117 | 0.2484 | 0.5863 | 0.6519 | 796.0 | 1221.0 | 0.9101 | 0.8901 | 0.8999 |
| No log | 8.0 | 56 | 3.4234 | 0.4564 | 0.2345 | 0.398 | 0.3978 | 20.72 | 0.1148 | 0.247 | 0.5806 | 0.6478 | 791.0 | 1221.0 | 0.9094 | 0.8891 | 0.899 |
| No log | 9.0 | 63 | 3.3645 | 0.4518 | 0.2273 | 0.3912 | 0.3913 | 20.82 | 0.1111 | 0.2376 | 0.5886 | 0.6536 | 798.0 | 1221.0 | 0.9087 | 0.8899 | 0.8991 |
| No log | 10.0 | 70 | 3.3340 | 0.4569 | 0.2272 | 0.3918 | 0.3927 | 20.82 | 0.1112 | 0.2401 | 0.5852 | 0.6511 | 795.0 | 1221.0 | 0.9098 | 0.8906 | 0.9 |

BERTScore hashcode for all rows: roberta-large_L17_no-idf_version=0.3.12(hug_trans=4.53.0)
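
The reported metrics could be reproduced along the lines of the sketch below with the `evaluate` library; the hashcode indicates BERTScore with roberta-large (layer 17) and no IDF weighting. `predictions` and `references` stand in for the generated and gold summaries of the evaluation set, which are not included here.

```python
import evaluate

rouge = evaluate.load("rouge")
bleu = evaluate.load("bleu")
bertscore = evaluate.load("bertscore")

predictions = ["generated summary ..."]  # placeholder lists
references = ["reference summary ..."]

rouge_scores = rouge.compute(predictions=predictions, references=references)
# bleu.compute also returns precisions, brevity_penalty, length_ratio,
# translation_length and reference_length, matching the columns above.
bleu_scores = bleu.compute(predictions=predictions, references=references)
bert_scores = bertscore.compute(
    predictions=predictions,
    references=references,
    model_type="roberta-large",  # layer 17, no IDF, per the reported hashcode
    idf=False,
)

print(rouge_scores)
print(bleu_scores["bleu"])
print(sum(bert_scores["f1"]) / len(bert_scores["f1"]))
```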

Framework versions

  • Transformers 4.53.0
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1