lyrics-distilgpt2

This model is a version of distilgpt2 fine-tuned on the smgriffin/modern-pop-lyrics dataset. It achieves the following results on the evaluation set:

  • Loss: 1.4199
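
Assuming the reported loss is the standard cross-entropy in nats, this corresponds to a perplexity of exp(1.4199) ≈ 4.14 on the evaluation set.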

Model description

This model is a version of distilgpt2 (81.9M parameters) fine-tuned to generate short samples of pop song lyrics.
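
A minimal sketch of loading the model for generation with the transformers text-generation pipeline; the prompt and sampling settings are illustrative assumptions, not values from the original setup:

```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
generator = pipeline("text-generation", model="masaharustin/lyrics-distilgpt2")

# Prompt and sampling parameters below are illustrative assumptions.
samples = generator(
    "Verse 1:",
    max_new_tokens=60,
    do_sample=True,
    top_p=0.95,
    temperature=0.9,
    num_return_sequences=2,
)
for sample in samples:
    print(sample["generated_text"])
    print("---")
```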

Intended uses & limitations

The model is intended for generating short samples of pop-style song lyrics. It inherits the limitations and biases of distilgpt2 and of the lyrics it was fine-tuned on, and generated text may echo phrases from existing songs.

Training and evaluation data

The model was fine-tuned and evaluated on the smgriffin/modern-pop-lyrics dataset; details of the train/validation split are not documented here.

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 3
  • mixed_precision_training: Native AMP
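
For reference, a minimal sketch of an equivalent transformers Trainer configuration; only the hyperparameters listed above come from this card, while output_dir and the evaluation cadence (every 500 steps, matching the results table below) are assumptions:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above; output_dir and the
# eval cadence are assumptions for illustration.
training_args = TrainingArguments(
    output_dir="lyrics-distilgpt2",
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",          # AdamW with betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    num_train_epochs=3,
    fp16=True,                    # native AMP mixed-precision training
    eval_strategy="steps",
    eval_steps=500,
)
```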

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 1.7005        | 0.1164 | 500   | 1.6527          |
| 1.665         | 0.2329 | 1000  | 1.6200          |
| 1.5971        | 0.3493 | 1500  | 1.5926          |
| 1.5971        | 0.4658 | 2000  | 1.5686          |
| 1.6219        | 0.5822 | 2500  | 1.5518          |
| 1.5498        | 0.6986 | 3000  | 1.5366          |
| 1.5365        | 0.8151 | 3500  | 1.5224          |
| 1.5884        | 0.9315 | 4000  | 1.5079          |
| 1.5592        | 1.0480 | 4500  | 1.4981          |
| 1.4967        | 1.1644 | 5000  | 1.4903          |
| 1.5201        | 1.2809 | 5500  | 1.4820          |
| 1.5183        | 1.3973 | 6000  | 1.4731          |
| 1.552         | 1.5137 | 6500  | 1.4663          |
| 1.5109        | 1.6302 | 7000  | 1.4597          |
| 1.4942        | 1.7466 | 7500  | 1.4538          |
| 1.4798        | 1.8631 | 8000  | 1.4464          |
| 1.5316        | 1.9795 | 8500  | 1.4422          |
| 1.4407        | 2.0959 | 9000  | 1.4381          |
| 1.424         | 2.2124 | 9500  | 1.4346          |
| 1.4886        | 2.3288 | 10000 | 1.4299          |
| 1.3938        | 2.4453 | 10500 | 1.4279          |
| 1.4472        | 2.5617 | 11000 | 1.4250          |
| 1.4942        | 2.6782 | 11500 | 1.4232          |
| 1.481         | 2.7946 | 12000 | 1.4201          |
| 1.4804        | 2.9110 | 12500 | 1.4199          |
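
Validation loss was logged every 500 steps; since 500 steps correspond to about 0.1164 epochs, one epoch is roughly 4,300 optimizer steps, i.e. about 17,200 training examples at a batch size of 4.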

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.1
  • Tokenizers 0.21.1
