---
license: apache-2.0
base_model: google/flan-t5-base
tags:
  - generated_from_trainer
datasets:
  - samsum
metrics:
  - rouge
model-index:
  - name: flan-t5-base-samsum
    results:
      - task:
          name: Sequence-to-sequence Language Modeling
          type: text2text-generation
        dataset:
          name: samsum
          type: samsum
          config: samsum
          split: test
          args: samsum
        metrics:
          - name: Rouge1
            type: rouge
            value: 47.08
---

# flan-t5-base-samsum

This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the samsum dataset. It achieves the following results on the evaluation set (a short inference sketch follows the list):

- Loss: 1.3859
- Rouge1: 47.08
- Rouge2: 23.2603
- Rougel: 39.2645
- Rougelsum: 43.2898
- Gen Len: 17.3333
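
Since the card does not yet include usage instructions, here is a minimal inference sketch. The repo id `texasdave2/flan-t5-base-samsum` is an assumption inferred from this repository's path, and the dialogue is made up for illustration:

```python
# Minimal inference sketch, assuming the repo id matches this model card.
from transformers import pipeline

summarizer = pipeline("summarization", model="texasdave2/flan-t5-base-samsum")

# Illustrative samsum-style dialogue (not from the dataset).
dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes! Noon at the usual place?\n"
    "Anna: Perfect, see you then."
)
print(summarizer(dialogue, max_length=60, min_length=5)[0]["summary_text"])
```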

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):

- learning_rate: 5e-05
- train_batch_size: 24
- eval_batch_size: 24
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 2
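
For context, a hedged sketch of how these settings map onto `Seq2SeqTrainingArguments` in Transformers 4.33.x. The values not listed above (`output_dir`, the evaluation cadence, generation-based evaluation) are assumptions inferred from the results table below; the Adam betas and epsilon match the library defaults, so they are not set explicitly:

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameter list above; anything not in that
# list is an assumption, not a record of the actual training run.
training_args = Seq2SeqTrainingArguments(
    output_dir="flan-t5-base-samsum",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=24,
    per_device_eval_batch_size=24,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=2,
    evaluation_strategy="steps",       # inferred: the table below logs every 50 steps
    eval_steps=50,
    predict_with_generate=True,        # inferred: ROUGE and Gen Len require generation
)
```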

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
| 1.5121        | 0.08  | 50   | 1.4287          | 46.7443 | 22.8826 | 38.9466 | 42.862    | 16.9634 |
| 1.46          | 0.16  | 100  | 1.4199          | 46.7723 | 22.8011 | 39.0224 | 42.9095   | 17.2393 |
| 1.4515        | 0.24  | 150  | 1.4147          | 46.6593 | 23.027  | 38.9378 | 42.8492   | 17.1245 |
| 1.4679        | 0.33  | 200  | 1.4121          | 46.8312 | 22.8345 | 39.1545 | 43.2035   | 17.3431 |
| 1.451         | 0.41  | 250  | 1.4109          | 46.826  | 23.038  | 39.2744 | 43.3106   | 17.2686 |
| 1.4434        | 0.49  | 300  | 1.4040          | 46.6744 | 23.0221 | 39.3167 | 43.1835   | 16.9158 |
| 1.4417        | 0.57  | 350  | 1.4007          | 46.851  | 23.0448 | 39.2346 | 43.2396   | 17.1172 |
| 1.4781        | 0.65  | 400  | 1.3952          | 46.7831 | 23.1146 | 39.295  | 43.2256   | 17.2076 |
| 1.4626        | 0.73  | 450  | 1.3940          | 47.0933 | 23.2741 | 39.2954 | 43.3102   | 17.2222 |
| 1.4307        | 0.81  | 500  | 1.3955          | 46.8827 | 23.2016 | 39.2817 | 43.2379   | 17.2002 |
| 1.4586        | 0.9   | 550  | 1.3933          | 46.7152 | 23.1439 | 39.2576 | 43.1754   | 17.3040 |
| 1.4465        | 0.98  | 600  | 1.3905          | 46.8332 | 23.3356 | 39.2596 | 43.2472   | 17.3468 |
| 1.381         | 1.06  | 650  | 1.3953          | 46.9289 | 22.9605 | 39.0651 | 43.2085   | 17.4066 |
| 1.4125        | 1.14  | 700  | 1.3922          | 46.4822 | 23.0893 | 38.9024 | 42.9789   | 17.2381 |
| 1.3667        | 1.22  | 750  | 1.3922          | 47.2977 | 23.4064 | 39.5091 | 43.5742   | 17.2930 |
| 1.3878        | 1.3   | 800  | 1.3953          | 46.6405 | 23.2132 | 39.2853 | 43.3049   | 17.3358 |
| 1.3884        | 1.38  | 850  | 1.3931          | 46.9152 | 23.1594 | 39.1629 | 43.2254   | 17.3614 |
| 1.3766        | 1.47  | 900  | 1.3898          | 46.988  | 23.1708 | 39.2446 | 43.311    | 17.3333 |
| 1.3727        | 1.55  | 950  | 1.3889          | 46.6771 | 23.0915 | 39.0787 | 43.0184   | 17.3211 |
| 1.4001        | 1.63  | 1000 | 1.3859          | 47.08   | 23.2603 | 39.2645 | 43.2898   | 17.3333 |
| 1.3894        | 1.71  | 1050 | 1.3874          | 47.2134 | 23.3696 | 39.4356 | 43.5422   | 17.3297 |
| 1.3697        | 1.79  | 1100 | 1.3860          | 47.06   | 23.3769 | 39.3494 | 43.4113   | 17.3504 |
| 1.3886        | 1.87  | 1150 | 1.3862          | 47.0159 | 23.3728 | 39.3871 | 43.4016   | 17.3260 |
| 1.4037        | 1.95  | 1200 | 1.3861          | 47.0039 | 23.4055 | 39.3356 | 43.3787   | 17.3321 |
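
The ROUGE columns above are on a 0-100 scale, typically the F-measure scaled by 100. A hedged sketch of the standard computation with the `evaluate` library; the trainer's exact `compute_metrics` function is not part of this card, and the prediction/reference pair below is made up for illustration:

```python
import evaluate

# Illustrative only: one made-up prediction/reference pair.
rouge = evaluate.load("rouge")
scores = rouge.compute(
    predictions=["amanda baked cookies and will bring jerry some tomorrow"],
    references=["Amanda baked cookies and will bring Jerry some tomorrow."],
    use_stemmer=True,
)
# Scale the 0-1 floats to the percent-style values used in the table above.
print({k: round(v * 100, 4) for k, v in scores.items()})
```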

### Framework versions

- Transformers 4.33.2
- Pytorch 2.0.0+cu117
- Datasets 2.14.5
- Tokenizers 0.13.3