outputs

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch for reproducing these metrics follows the list):

  • Loss: 0.4804
  • WER (word error rate): 0.3867
  • CER (character error rate): 0.1484
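
How these metrics were computed is not stated on the card; below is a minimal sketch for reproducing WER and CER with the Hugging Face `evaluate` library (an assumption: the training script may have computed them differently). The prediction and reference strings are hypothetical.

```python
import evaluate

# Hypothetical predictions and references; the actual evaluation set is unknown.
predictions = ["the quick brown fox", "hello world"]
references = ["the quick brown fox jumps", "hello word"]

wer = evaluate.load("wer")  # word error rate
cer = evaluate.load("cer")  # character error rate

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```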

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.25
  • num_epochs: 60.0
  • mixed_precision_training: Native AMP
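
These settings map onto transformers `TrainingArguments` roughly as follows. This is a reconstruction from the list above, not the author's actual training script: `output_dir` is guessed from the model name, `fp16=True` stands in for "Native AMP", and per-epoch evaluation is inferred from the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # guessed from the model name
    learning_rate=5e-4,
    per_device_train_batch_size=16,  # assuming a single device
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.25,
    num_train_epochs=60,
    fp16=True,                       # "Native AMP"
    eval_strategy="epoch",           # inferred from the per-epoch results below
)
```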

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|:---|---:|---:|---:|---:|---:|
| 0.126 | 1.0 | 1080 | 0.4804 | 0.3867 | 0.1485 |
| 0.1441 | 2.0 | 2160 | 0.6097 | 0.4424 | 0.1950 |
| 0.1675 | 3.0 | 3240 | 0.5237 | 0.4448 | 0.1676 |
| 0.1919 | 4.0 | 4320 | 0.6256 | 0.4844 | 0.1884 |
| 0.2168 | 5.0 | 5400 | 0.6817 | 0.5131 | 0.1992 |
| 0.2411 | 6.0 | 6480 | 0.6816 | 0.5234 | 0.2041 |
| 0.2493 | 7.0 | 7560 | 0.8295 | 0.6788 | 0.2559 |
| 0.2718 | 8.0 | 8640 | 0.8849 | 0.6757 | 0.2669 |
| 0.2922 | 9.0 | 9720 | 1.0527 | 0.6722 | 0.3401 |
| 0.3156 | 10.0 | 10800 | 1.0661 | 0.7528 | 0.3576 |
| 0.3273 | 11.0 | 11880 | 1.0083 | 0.7841 | 0.2930 |
| 0.3216 | 12.0 | 12960 | 1.1305 | 0.7282 | 0.3154 |
| 0.3498 | 13.0 | 14040 | 1.0759 | 0.7312 | 0.3106 |
| 0.3553 | 14.0 | 15120 | 0.8732 | 0.6757 | 0.2803 |
| 0.3582 | 15.0 | 16200 | 1.0551 | 0.7623 | 0.3185 |
| 0.3607 | 16.0 | 17280 | 1.0535 | 0.7483 | 0.3101 |
| 0.3447 | 17.0 | 18360 | 1.0640 | 0.7369 | 0.3081 |
| 0.325 | 18.0 | 19440 | 1.0327 | 0.7535 | 0.2905 |
| 0.3022 | 19.0 | 20520 | 0.9870 | 0.7232 | 0.2887 |
| 0.2825 | 20.0 | 21600 | 0.9183 | 0.6864 | 0.2806 |
| 0.2706 | 21.0 | 22680 | 0.9366 | 0.6812 | 0.2860 |
| 0.2507 | 22.0 | 23760 | 0.9585 | 0.6941 | 0.2608 |
| 0.237 | 23.0 | 24840 | 1.0100 | 0.6798 | 0.2802 |
| 0.2298 | 24.0 | 25920 | 0.9185 | 0.6349 | 0.2449 |
| 0.221 | 25.0 | 27000 | 0.9353 | 0.6580 | 0.2785 |
| 0.2052 | 26.0 | 28080 | 0.8652 | 0.6493 | 0.2507 |
| 0.1928 | 27.0 | 29160 | 0.8859 | 0.6776 | 0.2631 |
| 0.1889 | 28.0 | 30240 | 0.9240 | 0.6637 | 0.2666 |
| 0.1771 | 29.0 | 31320 | 0.9043 | 0.6256 | 0.2493 |
| 0.163 | 30.0 | 32400 | 0.9131 | 0.6504 | 0.2621 |
| 0.1603 | 31.0 | 33480 | 0.8102 | 0.6319 | 0.2406 |
| 0.1447 | 32.0 | 34560 | 0.9245 | 0.6337 | 0.2448 |
| 0.1418 | 33.0 | 35640 | 0.9590 | 0.6236 | 0.2530 |
| 0.1415 | 34.0 | 36720 | 0.9275 | 0.6345 | 0.2579 |
| 0.1313 | 35.0 | 37800 | 0.8644 | 0.6280 | 0.2498 |
| 0.1285 | 36.0 | 38880 | 0.9071 | 0.625 | 0.2651 |
| 0.1204 | 37.0 | 39960 | 0.8658 | 0.6092 | 0.2387 |
| 0.1116 | 38.0 | 41040 | 0.8684 | 0.6267 | 0.2459 |
| 0.102 | 39.0 | 42120 | 0.9792 | 0.6245 | 0.2410 |
| 0.0966 | 40.0 | 43200 | 0.8881 | 0.6163 | 0.2466 |
| 0.0934 | 41.0 | 44280 | 0.8669 | 0.5971 | 0.2340 |
| 0.0847 | 42.0 | 45360 | 0.9718 | 0.6207 | 0.2371 |
| 0.0828 | 43.0 | 46440 | 0.9573 | 0.6223 | 0.2393 |
| 0.0727 | 44.0 | 47520 | 0.9872 | 0.6097 | 0.2358 |
| 0.0701 | 45.0 | 48600 | 0.9421 | 0.6116 | 0.2446 |
| 0.0648 | 46.0 | 49680 | 0.9591 | 0.6043 | 0.2467 |
| 0.0634 | 47.0 | 50760 | 0.9991 | 0.6110 | 0.2355 |
| 0.0573 | 48.0 | 51840 | 0.9873 | 0.6054 | 0.2345 |
| 0.0527 | 49.0 | 52920 | 0.9886 | 0.5936 | 0.2325 |
| 0.0506 | 50.0 | 54000 | 1.0199 | 0.5941 | 0.2287 |
| 0.0486 | 51.0 | 55080 | 1.0691 | 0.5881 | 0.2263 |
| 0.0447 | 52.0 | 56160 | 1.0141 | 0.5893 | 0.2296 |
| 0.0419 | 53.0 | 57240 | 1.0658 | 0.5873 | 0.2279 |
| 0.0376 | 54.0 | 58320 | 1.1441 | 0.5889 | 0.2254 |
| 0.0355 | 55.0 | 59400 | 1.1462 | 0.5881 | 0.2249 |
| 0.0335 | 56.0 | 60480 | 1.1712 | 0.5860 | 0.2244 |
| 0.0296 | 57.0 | 61560 | 1.1622 | 0.5786 | 0.2218 |
| 0.0301 | 58.0 | 62640 | 1.1704 | 0.5840 | 0.2235 |
| 0.0283 | 59.0 | 63720 | 1.1973 | 0.5805 | 0.2213 |
| 0.0245 | 60.0 | 64800 | 1.1908 | 0.5762 | 0.2198 |

Validation loss, WER, and CER are all at their best after the first epoch and worsen with continued training; the evaluation results reported at the top of this card match the epoch-1 row (up to rounding in the CER).

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
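
A quick way to check that a local environment matches these pins (the full dependency set beyond these four packages is unknown):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare installed versions against the versions reported on this card.
expected = {
    "Transformers": (transformers.__version__, "4.48.3"),
    "Pytorch": (torch.__version__, "2.5.1"),
    "Datasets": (datasets.__version__, "3.2.0"),
    "Tokenizers": (tokenizers.__version__, "0.21.0"),
}
for name, (installed, wanted) in expected.items():
    status = "OK" if installed == wanted else f"expected {wanted}"
    print(f"{name}: {installed} ({status})")
```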

Model size

  • Parameters: 94.4M (see the verification sketch below)
  • Tensor type: F32
  • Format: Safetensors
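
The parameter count can be verified from a local copy of the weights. This is a sketch only: the repository id is not stated on the card, so "path/to/outputs" is a placeholder, and `AutoModel` assumes the architecture is one that transformers can auto-resolve.

```python
from transformers import AutoModel

# "path/to/outputs" is a placeholder for a local checkout of this model;
# the actual repository id is not given on the card.
model = AutoModel.from_pretrained("path/to/outputs")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # the card reports 94.4M
```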