outputs

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch for reproducing these metrics follows the list):

  • Loss: 0.4804
  • WER (word error rate): 0.3867
  • CER (character error rate): 0.1484
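
How these metrics were computed is not stated on the card; below is a minimal sketch for reproducing WER and CER with the Hugging Face `evaluate` library (an assumption: the training script may have computed them differently). The prediction and reference strings are hypothetical.

```python
import evaluate

# Hypothetical predictions and references; the actual evaluation set is unknown.
predictions = ["the quick brown fox", "hello world"]
references = ["the quick brown fox jumps", "hello word"]

wer = evaluate.load("wer")  # word error rate
cer = evaluate.load("cer")  # character error rate

print("WER:", wer.compute(predictions=predictions, references=references))
print("CER:", cer.compute(predictions=predictions, references=references))
```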

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.25
  • num_epochs: 60.0
  • mixed_precision_training: Native AMP
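
These settings map onto transformers `TrainingArguments` roughly as follows. This is a reconstruction from the list above, not the author's actual training script: `output_dir` is guessed from the model name, `fp16=True` stands in for "Native AMP", and per-epoch evaluation is inferred from the results table.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="outputs",            # guessed from the model name
    learning_rate=5e-4,
    per_device_train_batch_size=16,  # assuming a single device
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.25,
    num_train_epochs=60,
    fp16=True,                       # "Native AMP"
    eval_strategy="epoch",           # inferred from the per-epoch results below
)
```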

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|:---|---:|---:|---:|---:|---:|
| 0.126 | 1.0 | 1080 | 0.4804 | 0.3867 | 0.1485 |
| 0.1441 | 2.0 | 2160 | 0.6097 | 0.4424 | 0.1950 |
| 0.1675 | 3.0 | 3240 | 0.5237 | 0.4448 | 0.1676 |
| 0.1919 | 4.0 | 4320 | 0.6256 | 0.4844 | 0.1884 |
| 0.2168 | 5.0 | 5400 | 0.6817 | 0.5131 | 0.1992 |
| 0.2411 | 6.0 | 6480 | 0.6816 | 0.5234 | 0.2041 |
| 0.2493 | 7.0 | 7560 | 0.8295 | 0.6788 | 0.2559 |
| 0.2718 | 8.0 | 8640 | 0.8849 | 0.6757 | 0.2669 |
| 0.2922 | 9.0 | 9720 | 1.0527 | 0.6722 | 0.3401 |
| 0.3156 | 10.0 | 10800 | 1.0661 | 0.7528 | 0.3576 |
| 0.3273 | 11.0 | 11880 | 1.0083 | 0.7841 | 0.2930 |
| 0.3216 | 12.0 | 12960 | 1.1305 | 0.7282 | 0.3154 |
| 0.3498 | 13.0 | 14040 | 1.0759 | 0.7312 | 0.3106 |
| 0.3553 | 14.0 | 15120 | 0.8732 | 0.6757 | 0.2803 |
| 0.3582 | 15.0 | 16200 | 1.0551 | 0.7623 | 0.3185 |
| 0.3607 | 16.0 | 17280 | 1.0535 | 0.7483 | 0.3101 |
| 0.3447 | 17.0 | 18360 | 1.0640 | 0.7369 | 0.3081 |
| 0.325 | 18.0 | 19440 | 1.0327 | 0.7535 | 0.2905 |
| 0.3022 | 19.0 | 20520 | 0.9870 | 0.7232 | 0.2887 |
| 0.2825 | 20.0 | 21600 | 0.9183 | 0.6864 | 0.2806 |
| 0.2706 | 21.0 | 22680 | 0.9366 | 0.6812 | 0.2860 |
| 0.2507 | 22.0 | 23760 | 0.9585 | 0.6941 | 0.2608 |
| 0.237 | 23.0 | 24840 | 1.0100 | 0.6798 | 0.2802 |
| 0.2298 | 24.0 | 25920 | 0.9185 | 0.6349 | 0.2449 |
| 0.221 | 25.0 | 27000 | 0.9353 | 0.6580 | 0.2785 |
| 0.2052 | 26.0 | 28080 | 0.8652 | 0.6493 | 0.2507 |
| 0.1928 | 27.0 | 29160 | 0.8859 | 0.6776 | 0.2631 |
| 0.1889 | 28.0 | 30240 | 0.9240 | 0.6637 | 0.2666 |
| 0.1771 | 29.0 | 31320 | 0.9043 | 0.6256 | 0.2493 |
| 0.163 | 30.0 | 32400 | 0.9131 | 0.6504 | 0.2621 |
| 0.1603 | 31.0 | 33480 | 0.8102 | 0.6319 | 0.2406 |
| 0.1447 | 32.0 | 34560 | 0.9245 | 0.6337 | 0.2448 |
| 0.1418 | 33.0 | 35640 | 0.9590 | 0.6236 | 0.2530 |
| 0.1415 | 34.0 | 36720 | 0.9275 | 0.6345 | 0.2579 |
| 0.1313 | 35.0 | 37800 | 0.8644 | 0.6280 | 0.2498 |
| 0.1285 | 36.0 | 38880 | 0.9071 | 0.625 | 0.2651 |
| 0.1204 | 37.0 | 39960 | 0.8658 | 0.6092 | 0.2387 |
| 0.1116 | 38.0 | 41040 | 0.8684 | 0.6267 | 0.2459 |
| 0.102 | 39.0 | 42120 | 0.9792 | 0.6245 | 0.2410 |
| 0.0966 | 40.0 | 43200 | 0.8881 | 0.6163 | 0.2466 |
| 0.0934 | 41.0 | 44280 | 0.8669 | 0.5971 | 0.2340 |
| 0.0847 | 42.0 | 45360 | 0.9718 | 0.6207 | 0.2371 |
| 0.0828 | 43.0 | 46440 | 0.9573 | 0.6223 | 0.2393 |
| 0.0727 | 44.0 | 47520 | 0.9872 | 0.6097 | 0.2358 |
| 0.0701 | 45.0 | 48600 | 0.9421 | 0.6116 | 0.2446 |
| 0.0648 | 46.0 | 49680 | 0.9591 | 0.6043 | 0.2467 |
| 0.0634 | 47.0 | 50760 | 0.9991 | 0.6110 | 0.2355 |
| 0.0573 | 48.0 | 51840 | 0.9873 | 0.6054 | 0.2345 |
| 0.0527 | 49.0 | 52920 | 0.9886 | 0.5936 | 0.2325 |
| 0.0506 | 50.0 | 54000 | 1.0199 | 0.5941 | 0.2287 |
| 0.0486 | 51.0 | 55080 | 1.0691 | 0.5881 | 0.2263 |
| 0.0447 | 52.0 | 56160 | 1.0141 | 0.5893 | 0.2296 |
| 0.0419 | 53.0 | 57240 | 1.0658 | 0.5873 | 0.2279 |
| 0.0376 | 54.0 | 58320 | 1.1441 | 0.5889 | 0.2254 |
| 0.0355 | 55.0 | 59400 | 1.1462 | 0.5881 | 0.2249 |
| 0.0335 | 56.0 | 60480 | 1.1712 | 0.5860 | 0.2244 |
| 0.0296 | 57.0 | 61560 | 1.1622 | 0.5786 | 0.2218 |
| 0.0301 | 58.0 | 62640 | 1.1704 | 0.5840 | 0.2235 |
| 0.0283 | 59.0 | 63720 | 1.1973 | 0.5805 | 0.2213 |
| 0.0245 | 60.0 | 64800 | 1.1908 | 0.5762 | 0.2198 |

Validation loss, WER, and CER are all at their best after the first epoch and worsen with continued training; the evaluation results reported at the top of this card match the epoch-1 row (up to rounding in the CER).

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
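
A quick way to check that a local environment matches these pins (the full dependency set beyond these four packages is unknown):

```python
import datasets
import tokenizers
import torch
import transformers

# Compare installed versions against the versions reported on this card.
expected = {
    "Transformers": (transformers.__version__, "4.48.3"),
    "Pytorch": (torch.__version__, "2.5.1"),
    "Datasets": (datasets.__version__, "3.2.0"),
    "Tokenizers": (tokenizers.__version__, "0.21.0"),
}
for name, (installed, wanted) in expected.items():
    status = "OK" if installed == wanted else f"expected {wanted}"
    print(f"{name}: {installed} ({status})")
```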

Model size

  • Parameters: 94.4M (see the verification sketch below)
  • Tensor type: F32
  • Format: Safetensors
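
The parameter count can be verified from a local copy of the weights. This is a sketch only: the repository id is not stated on the card, so "path/to/outputs" is a placeholder, and `AutoModel` assumes the architecture is one that transformers can auto-resolve.

```python
from transformers import AutoModel

# "path/to/outputs" is a placeholder for a local checkout of this model;
# the actual repository id is not given on the card.
model = AutoModel.from_pretrained("path/to/outputs")
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.1f}M parameters")  # the card reports 94.4M
```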