# Shakespeare-T5-small
This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 0.8180
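Below is a minimal inference sketch using the standard `transformers` seq2seq API. The repository id comes from this model's page; the example prompt and generation settings are illustrative assumptions, since the task prefix and prompt format used during fine-tuning are not documented.

```python
# Minimal inference sketch; prompt format is an assumption, see note above.
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_id = "Hananguyen12/Shakespeare-T5-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Illustrative input only; the fine-tuning prompt format is undocumented.
inputs = tokenizer("Shall I compare thee to a summer's day?", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```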
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 15
- mixed_precision_training: Native AMP
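A hedged sketch of how these hyperparameters map onto `Seq2SeqTrainingArguments` in transformers 4.52 follows; the output directory is hypothetical, and the model/dataset wiring is omitted because the training data is not documented.

```python
# Reproduction sketch of the hyperparameters above (Trainer wiring omitted).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="shakespeare-t5-small",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",                # AdamW; betas=(0.9, 0.999), eps=1e-8 are the defaults
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=15,
    fp16=True,                          # native AMP mixed precision
)
```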
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
2.5071 | 0.0978 | 100 | 2.3866 |
1.924 | 0.1957 | 200 | 1.7508 |
1.3181 | 0.2935 | 300 | 1.4820 |
1.4516 | 0.3914 | 400 | 1.3604 |
1.3453 | 0.4892 | 500 | 1.2847 |
1.2216 | 0.5871 | 600 | 1.2377 |
1.5336 | 0.6849 | 700 | 1.2089 |
1.1803 | 0.7828 | 800 | 1.1853 |
1.352 | 0.8806 | 900 | 1.1659 |
1.25 | 0.9785 | 1000 | 1.1453 |
1.1877 | 1.0763 | 1100 | 1.1323 |
1.1111 | 1.1742 | 1200 | 1.1171 |
1.2295 | 1.2720 | 1300 | 1.1074 |
1.1602 | 1.3699 | 1400 | 1.0963 |
1.0951 | 1.4677 | 1500 | 1.0852 |
1.1102 | 1.5656 | 1600 | 1.0757 |
0.9648 | 1.6634 | 1700 | 1.0671 |
0.9534 | 1.7613 | 1800 | 1.0591 |
1.1413 | 1.8591 | 1900 | 1.0526 |
1.1051 | 1.9569 | 2000 | 1.0470 |
1.1152 | 2.0548 | 2100 | 1.0410 |
1.233 | 2.1526 | 2200 | 1.0367 |
0.9705 | 2.2505 | 2300 | 1.0316 |
0.8814 | 2.3483 | 2400 | 1.0250 |
0.967 | 2.4462 | 2500 | 1.0208 |
1.0735 | 2.5440 | 2600 | 1.0168 |
1.1725 | 2.6419 | 2700 | 1.0098 |
1.0661 | 2.7397 | 2800 | 1.0073 |
1.0255 | 2.8376 | 2900 | 1.0022 |
1.0509 | 2.9354 | 3000 | 0.9992 |
1.087 | 3.0333 | 3100 | 0.9942 |
1.0105 | 3.1311 | 3200 | 0.9897 |
1.0746 | 3.2290 | 3300 | 0.9860 |
1.0384 | 3.3268 | 3400 | 0.9833 |
1.0473 | 3.4247 | 3500 | 0.9791 |
1.0041 | 3.5225 | 3600 | 0.9744 |
1.0455 | 3.6204 | 3700 | 0.9713 |
1.0431 | 3.7182 | 3800 | 0.9675 |
0.8924 | 3.8160 | 3900 | 0.9645 |
0.8299 | 3.9139 | 4000 | 0.9611 |
0.8729 | 4.0117 | 4100 | 0.9583 |
0.8513 | 4.1096 | 4200 | 0.9556 |
1.0433 | 4.2074 | 4300 | 0.9512 |
0.9206 | 4.3053 | 4400 | 0.9491 |
0.9408 | 4.4031 | 4500 | 0.9461 |
0.9793 | 4.5010 | 4600 | 0.9433 |
0.8709 | 4.5988 | 4700 | 0.9402 |
0.9612 | 4.6967 | 4800 | 0.9370 |
1.0882 | 4.7945 | 4900 | 0.9346 |
0.9773 | 4.8924 | 5000 | 0.9331 |
0.9365 | 4.9902 | 5100 | 0.9302 |
0.7646 | 5.0881 | 5200 | 0.9276 |
1.0583 | 5.1859 | 5300 | 0.9246 |
0.9999 | 5.2838 | 5400 | 0.9223 |
0.9926 | 5.3816 | 5500 | 0.9207 |
0.8495 | 5.4795 | 5600 | 0.9194 |
1.0125 | 5.5773 | 5700 | 0.9163 |
0.858 | 5.6751 | 5800 | 0.9145 |
0.9271 | 5.7730 | 5900 | 0.9121 |
0.9491 | 5.8708 | 6000 | 0.9111 |
0.998 | 5.9687 | 6100 | 0.9079 |
1.0376 | 6.0665 | 6200 | 0.9048 |
0.9302 | 6.1644 | 6300 | 0.9037 |
0.8533 | 6.2622 | 6400 | 0.9017 |
0.9285 | 6.3601 | 6500 | 0.8983 |
0.8977 | 6.4579 | 6600 | 0.8961 |
0.9745 | 6.5558 | 6700 | 0.8953 |
0.9405 | 6.6536 | 6800 | 0.8925 |
0.9929 | 6.7515 | 6900 | 0.8917 |
0.7314 | 6.8493 | 7000 | 0.8897 |
0.8764 | 6.9472 | 7100 | 0.8875 |
0.8443 | 7.0450 | 7200 | 0.8863 |
0.7612 | 7.1429 | 7300 | 0.8847 |
1.0069 | 7.2407 | 7400 | 0.8829 |
1.0436 | 7.3386 | 7500 | 0.8816 |
0.8622 | 7.4364 | 7600 | 0.8812 |
0.8886 | 7.5342 | 7700 | 0.8783 |
0.9365 | 7.6321 | 7800 | 0.8763 |
0.8307 | 7.7299 | 7900 | 0.8749 |
0.7874 | 7.8278 | 8000 | 0.8731 |
0.8606 | 7.9256 | 8100 | 0.8711 |
0.9094 | 8.0235 | 8200 | 0.8698 |
0.8108 | 8.1213 | 8300 | 0.8683 |
1.1008 | 8.2192 | 8400 | 0.8665 |
0.8168 | 8.3170 | 8500 | 0.8642 |
0.9125 | 8.4149 | 8600 | 0.8635 |
0.9218 | 8.5127 | 8700 | 0.8615 |
0.9688 | 8.6106 | 8800 | 0.8596 |
0.8679 | 8.7084 | 8900 | 0.8582 |
0.7928 | 8.8063 | 9000 | 0.8574 |
0.7947 | 8.9041 | 9100 | 0.8561 |
0.9055 | 9.0020 | 9200 | 0.8554 |
1.0261 | 9.0998 | 9300 | 0.8538 |
0.8588 | 9.1977 | 9400 | 0.8530 |
0.8789 | 9.2955 | 9500 | 0.8526 |
0.9111 | 9.3933 | 9600 | 0.8511 |
0.7969 | 9.4912 | 9700 | 0.8503 |
0.8683 | 9.5890 | 9800 | 0.8492 |
0.8884 | 9.6869 | 9900 | 0.8479 |
0.9843 | 9.7847 | 10000 | 0.8464 |
0.7575 | 9.8826 | 10100 | 0.8449 |
0.887 | 9.9804 | 10200 | 0.8441 |
0.9358 | 10.0783 | 10300 | 0.8428 |
0.7544 | 10.1761 | 10400 | 0.8422 |
0.7679 | 10.2740 | 10500 | 0.8411 |
0.7898 | 10.3718 | 10600 | 0.8406 |
0.8641 | 10.4697 | 10700 | 0.8397 |
0.9489 | 10.5675 | 10800 | 0.8385 |
0.8384 | 10.6654 | 10900 | 0.8380 |
0.8114 | 10.7632 | 11000 | 0.8372 |
0.8874 | 10.8611 | 11100 | 0.8361 |
0.7843 | 10.9589 | 11200 | 0.8349 |
0.7065 | 11.0568 | 11300 | 0.8343 |
0.803 | 11.1546 | 11400 | 0.8339 |
0.827 | 11.2524 | 11500 | 0.8328 |
0.8149 | 11.3503 | 11600 | 0.8319 |
0.8248 | 11.4481 | 11700 | 0.8314 |
0.7901 | 11.5460 | 11800 | 0.8308 |
0.7693 | 11.6438 | 11900 | 0.8301 |
0.8768 | 11.7417 | 12000 | 0.8293 |
0.7997 | 11.8395 | 12100 | 0.8292 |
0.8565 | 11.9374 | 12200 | 0.8282 |
0.7635 | 12.0352 | 12300 | 0.8274 |
0.7206 | 12.1331 | 12400 | 0.8265 |
0.8557 | 12.2309 | 12500 | 0.8263 |
0.7447 | 12.3288 | 12600 | 0.8256 |
0.9202 | 12.4266 | 12700 | 0.8248 |
0.9656 | 12.5245 | 12800 | 0.8242 |
0.8597 | 12.6223 | 12900 | 0.8236 |
0.7596 | 12.7202 | 13000 | 0.8231 |
0.7913 | 12.8180 | 13100 | 0.8226 |
0.7493 | 12.9159 | 13200 | 0.8223 |
0.6831 | 13.0137 | 13300 | 0.8219 |
0.7939 | 13.1115 | 13400 | 0.8217 |
0.7227 | 13.2094 | 13500 | 0.8215 |
0.8662 | 13.3072 | 13600 | 0.8212 |
0.686 | 13.4051 | 13700 | 0.8210 |
0.7837 | 13.5029 | 13800 | 0.8206 |
0.8727 | 13.6008 | 13900 | 0.8202 |
0.7521 | 13.6986 | 14000 | 0.8199 |
0.7101 | 13.7965 | 14100 | 0.8196 |
0.9385 | 13.8943 | 14200 | 0.8193 |
0.7965 | 13.9922 | 14300 | 0.8191 |
0.8829 | 14.0900 | 14400 | 0.8188 |
0.8408 | 14.1879 | 14500 | 0.8186 |
0.8808 | 14.2857 | 14600 | 0.8185 |
0.8918 | 14.3836 | 14700 | 0.8184 |
0.756 | 14.4814 | 14800 | 0.8182 |
0.8249 | 14.5793 | 14900 | 0.8181 |
0.721 | 14.6771 | 15000 | 0.8181 |
0.8631 | 14.7750 | 15100 | 0.8180 |
0.789 | 14.8728 | 15200 | 0.8180 |
0.6724 | 14.9706 | 15300 | 0.8180 |
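For reference, assuming the reported validation loss is the mean token-level cross-entropy (the usual `Trainer` setup for seq2seq language modeling), it converts to perplexity via exp(loss):

```python
import math

# Assumes the reported loss is mean token-level cross-entropy.
val_loss = 0.8180
print(f"perplexity = exp({val_loss}) = {math.exp(val_loss):.2f}")  # ~2.27
```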
## Framework versions
- Transformers 4.52.3
- PyTorch 2.6.0+cu124
- Datasets 2.14.4
- Tokenizers 0.21.1
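For reproducibility, a hypothetical `requirements.txt`-style pin of the versions above; note that the CUDA 12.4 build of torch is distributed via the PyTorch wheel index (e.g. `--extra-index-url https://download.pytorch.org/whl/cu124`), not plain PyPI.

```text
# Hypothetical pin matching the versions listed above.
transformers==4.52.3
torch==2.6.0
datasets==2.14.4
tokenizers==0.21.1
```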