Shakespeare-T5-small

This model is a fine-tuned version of google-t5/t5-small on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8180

Model description

Shakespeare-T5-small is a T5-small encoder-decoder checkpoint (~60.5M parameters, FP32 safetensors) fine-tuned from google-t5/t5-small. As the name suggests, it is presumably intended for generating or rewriting Shakespeare-style English, but the task and training data are otherwise undocumented.

Intended uses & limitations

More information needed
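
No usage guidance is published for this checkpoint, but it loads as a standard T5 model. Below is a minimal, untested inference sketch; the prompt, the decoding settings, and the absence of a task prefix are assumptions, since the training setup is not documented.

```python
# Minimal inference sketch for Hananguyen12/Shakespeare-T5-small.
# Assumption: no task prefix was used during fine-tuning; adjust if needed.
from transformers import AutoTokenizer, T5ForConditionalGeneration

model_id = "Hananguyen12/Shakespeare-T5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# Hypothetical prompt; the expected input format is not documented.
inputs = tokenizer("Shall I compare thee to a summer's day?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```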

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 15
  • mixed_precision_training: Native AMP
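
The training script itself is not published. As a hedged reproduction sketch, the settings above map onto Hugging Face TrainingArguments roughly as follows; output_dir is a placeholder, and the model/data wiring is left out.

```python
# Sketch of TrainingArguments matching the listed hyperparameters.
# The actual training script is not included with this model card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="shakespeare-t5-small",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",                # AdamW via torch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    fp16=True,                          # "Native AMP" mixed-precision training
)
```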

Training results

| Training Loss | Epoch | Step | Validation Loss |
|---------------|-------|------|-----------------|
| 2.5071 | 0.0978 | 100 | 2.3866 |
| 1.924 | 0.1957 | 200 | 1.7508 |
| 1.3181 | 0.2935 | 300 | 1.4820 |
| 1.4516 | 0.3914 | 400 | 1.3604 |
| 1.3453 | 0.4892 | 500 | 1.2847 |
| 1.2216 | 0.5871 | 600 | 1.2377 |
| 1.5336 | 0.6849 | 700 | 1.2089 |
| 1.1803 | 0.7828 | 800 | 1.1853 |
| 1.352 | 0.8806 | 900 | 1.1659 |
| 1.25 | 0.9785 | 1000 | 1.1453 |
| 1.1877 | 1.0763 | 1100 | 1.1323 |
| 1.1111 | 1.1742 | 1200 | 1.1171 |
| 1.2295 | 1.2720 | 1300 | 1.1074 |
| 1.1602 | 1.3699 | 1400 | 1.0963 |
| 1.0951 | 1.4677 | 1500 | 1.0852 |
| 1.1102 | 1.5656 | 1600 | 1.0757 |
| 0.9648 | 1.6634 | 1700 | 1.0671 |
| 0.9534 | 1.7613 | 1800 | 1.0591 |
| 1.1413 | 1.8591 | 1900 | 1.0526 |
| 1.1051 | 1.9569 | 2000 | 1.0470 |
| 1.1152 | 2.0548 | 2100 | 1.0410 |
| 1.233 | 2.1526 | 2200 | 1.0367 |
| 0.9705 | 2.2505 | 2300 | 1.0316 |
| 0.8814 | 2.3483 | 2400 | 1.0250 |
| 0.967 | 2.4462 | 2500 | 1.0208 |
| 1.0735 | 2.5440 | 2600 | 1.0168 |
| 1.1725 | 2.6419 | 2700 | 1.0098 |
| 1.0661 | 2.7397 | 2800 | 1.0073 |
| 1.0255 | 2.8376 | 2900 | 1.0022 |
| 1.0509 | 2.9354 | 3000 | 0.9992 |
| 1.087 | 3.0333 | 3100 | 0.9942 |
| 1.0105 | 3.1311 | 3200 | 0.9897 |
| 1.0746 | 3.2290 | 3300 | 0.9860 |
| 1.0384 | 3.3268 | 3400 | 0.9833 |
| 1.0473 | 3.4247 | 3500 | 0.9791 |
| 1.0041 | 3.5225 | 3600 | 0.9744 |
| 1.0455 | 3.6204 | 3700 | 0.9713 |
| 1.0431 | 3.7182 | 3800 | 0.9675 |
| 0.8924 | 3.8160 | 3900 | 0.9645 |
| 0.8299 | 3.9139 | 4000 | 0.9611 |
| 0.8729 | 4.0117 | 4100 | 0.9583 |
| 0.8513 | 4.1096 | 4200 | 0.9556 |
| 1.0433 | 4.2074 | 4300 | 0.9512 |
| 0.9206 | 4.3053 | 4400 | 0.9491 |
| 0.9408 | 4.4031 | 4500 | 0.9461 |
| 0.9793 | 4.5010 | 4600 | 0.9433 |
| 0.8709 | 4.5988 | 4700 | 0.9402 |
| 0.9612 | 4.6967 | 4800 | 0.9370 |
| 1.0882 | 4.7945 | 4900 | 0.9346 |
| 0.9773 | 4.8924 | 5000 | 0.9331 |
| 0.9365 | 4.9902 | 5100 | 0.9302 |
| 0.7646 | 5.0881 | 5200 | 0.9276 |
| 1.0583 | 5.1859 | 5300 | 0.9246 |
| 0.9999 | 5.2838 | 5400 | 0.9223 |
| 0.9926 | 5.3816 | 5500 | 0.9207 |
| 0.8495 | 5.4795 | 5600 | 0.9194 |
| 1.0125 | 5.5773 | 5700 | 0.9163 |
| 0.858 | 5.6751 | 5800 | 0.9145 |
| 0.9271 | 5.7730 | 5900 | 0.9121 |
| 0.9491 | 5.8708 | 6000 | 0.9111 |
| 0.998 | 5.9687 | 6100 | 0.9079 |
| 1.0376 | 6.0665 | 6200 | 0.9048 |
| 0.9302 | 6.1644 | 6300 | 0.9037 |
| 0.8533 | 6.2622 | 6400 | 0.9017 |
| 0.9285 | 6.3601 | 6500 | 0.8983 |
| 0.8977 | 6.4579 | 6600 | 0.8961 |
| 0.9745 | 6.5558 | 6700 | 0.8953 |
| 0.9405 | 6.6536 | 6800 | 0.8925 |
| 0.9929 | 6.7515 | 6900 | 0.8917 |
| 0.7314 | 6.8493 | 7000 | 0.8897 |
| 0.8764 | 6.9472 | 7100 | 0.8875 |
| 0.8443 | 7.0450 | 7200 | 0.8863 |
| 0.7612 | 7.1429 | 7300 | 0.8847 |
| 1.0069 | 7.2407 | 7400 | 0.8829 |
| 1.0436 | 7.3386 | 7500 | 0.8816 |
| 0.8622 | 7.4364 | 7600 | 0.8812 |
| 0.8886 | 7.5342 | 7700 | 0.8783 |
| 0.9365 | 7.6321 | 7800 | 0.8763 |
| 0.8307 | 7.7299 | 7900 | 0.8749 |
| 0.7874 | 7.8278 | 8000 | 0.8731 |
| 0.8606 | 7.9256 | 8100 | 0.8711 |
| 0.9094 | 8.0235 | 8200 | 0.8698 |
| 0.8108 | 8.1213 | 8300 | 0.8683 |
| 1.1008 | 8.2192 | 8400 | 0.8665 |
| 0.8168 | 8.3170 | 8500 | 0.8642 |
| 0.9125 | 8.4149 | 8600 | 0.8635 |
| 0.9218 | 8.5127 | 8700 | 0.8615 |
| 0.9688 | 8.6106 | 8800 | 0.8596 |
| 0.8679 | 8.7084 | 8900 | 0.8582 |
| 0.7928 | 8.8063 | 9000 | 0.8574 |
| 0.7947 | 8.9041 | 9100 | 0.8561 |
| 0.9055 | 9.0020 | 9200 | 0.8554 |
| 1.0261 | 9.0998 | 9300 | 0.8538 |
| 0.8588 | 9.1977 | 9400 | 0.8530 |
| 0.8789 | 9.2955 | 9500 | 0.8526 |
| 0.9111 | 9.3933 | 9600 | 0.8511 |
| 0.7969 | 9.4912 | 9700 | 0.8503 |
| 0.8683 | 9.5890 | 9800 | 0.8492 |
| 0.8884 | 9.6869 | 9900 | 0.8479 |
| 0.9843 | 9.7847 | 10000 | 0.8464 |
| 0.7575 | 9.8826 | 10100 | 0.8449 |
| 0.887 | 9.9804 | 10200 | 0.8441 |
| 0.9358 | 10.0783 | 10300 | 0.8428 |
| 0.7544 | 10.1761 | 10400 | 0.8422 |
| 0.7679 | 10.2740 | 10500 | 0.8411 |
| 0.7898 | 10.3718 | 10600 | 0.8406 |
| 0.8641 | 10.4697 | 10700 | 0.8397 |
| 0.9489 | 10.5675 | 10800 | 0.8385 |
| 0.8384 | 10.6654 | 10900 | 0.8380 |
| 0.8114 | 10.7632 | 11000 | 0.8372 |
| 0.8874 | 10.8611 | 11100 | 0.8361 |
| 0.7843 | 10.9589 | 11200 | 0.8349 |
| 0.7065 | 11.0568 | 11300 | 0.8343 |
| 0.803 | 11.1546 | 11400 | 0.8339 |
| 0.827 | 11.2524 | 11500 | 0.8328 |
| 0.8149 | 11.3503 | 11600 | 0.8319 |
| 0.8248 | 11.4481 | 11700 | 0.8314 |
| 0.7901 | 11.5460 | 11800 | 0.8308 |
| 0.7693 | 11.6438 | 11900 | 0.8301 |
| 0.8768 | 11.7417 | 12000 | 0.8293 |
| 0.7997 | 11.8395 | 12100 | 0.8292 |
| 0.8565 | 11.9374 | 12200 | 0.8282 |
| 0.7635 | 12.0352 | 12300 | 0.8274 |
| 0.7206 | 12.1331 | 12400 | 0.8265 |
| 0.8557 | 12.2309 | 12500 | 0.8263 |
| 0.7447 | 12.3288 | 12600 | 0.8256 |
| 0.9202 | 12.4266 | 12700 | 0.8248 |
| 0.9656 | 12.5245 | 12800 | 0.8242 |
| 0.8597 | 12.6223 | 12900 | 0.8236 |
| 0.7596 | 12.7202 | 13000 | 0.8231 |
| 0.7913 | 12.8180 | 13100 | 0.8226 |
| 0.7493 | 12.9159 | 13200 | 0.8223 |
| 0.6831 | 13.0137 | 13300 | 0.8219 |
| 0.7939 | 13.1115 | 13400 | 0.8217 |
| 0.7227 | 13.2094 | 13500 | 0.8215 |
| 0.8662 | 13.3072 | 13600 | 0.8212 |
| 0.686 | 13.4051 | 13700 | 0.8210 |
| 0.7837 | 13.5029 | 13800 | 0.8206 |
| 0.8727 | 13.6008 | 13900 | 0.8202 |
| 0.7521 | 13.6986 | 14000 | 0.8199 |
| 0.7101 | 13.7965 | 14100 | 0.8196 |
| 0.9385 | 13.8943 | 14200 | 0.8193 |
| 0.7965 | 13.9922 | 14300 | 0.8191 |
| 0.8829 | 14.0900 | 14400 | 0.8188 |
| 0.8408 | 14.1879 | 14500 | 0.8186 |
| 0.8808 | 14.2857 | 14600 | 0.8185 |
| 0.8918 | 14.3836 | 14700 | 0.8184 |
| 0.756 | 14.4814 | 14800 | 0.8182 |
| 0.8249 | 14.5793 | 14900 | 0.8181 |
| 0.721 | 14.6771 | 15000 | 0.8181 |
| 0.8631 | 14.7750 | 15100 | 0.8180 |
| 0.789 | 14.8728 | 15200 | 0.8180 |
| 0.6724 | 14.9706 | 15300 | 0.8180 |
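
The validation loss flattens out at 0.8180 over the last few hundred steps, so training has largely converged by epoch 15. Assuming the reported loss is the mean per-token cross-entropy in nats (the usual Trainer convention for this setup), the corresponding validation perplexity follows directly:

```python
import math

# Hedged conversion: perplexity = exp(mean cross-entropy loss).
print(math.exp(0.8180))  # ~2.27
```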

Framework versions

  • Transformers 4.52.3
  • PyTorch 2.6.0+cu124
  • Datasets 2.14.4
  • Tokenizers 0.21.1
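
To sanity-check a local environment against these versions, a quick snippet (the exact torch build string depends on your CUDA install):

```python
# Print installed versions to compare against the ones listed above.
import transformers, torch, datasets, tokenizers

print("transformers", transformers.__version__)  # card: 4.52.3
print("torch", torch.__version__)                # card: 2.6.0+cu124
print("datasets", datasets.__version__)          # card: 2.14.4
print("tokenizers", tokenizers.__version__)      # card: 0.21.1
```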