base_sami_22k_ftpseudo

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a sketch of how such metrics are typically computed follows the list):

  • Loss: 272.0352
  • WER: 0.4500
  • CER: 0.1412

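The error rates above are word error rate (WER) and character error rate (CER). Below is a minimal, illustrative sketch of how such metrics are commonly computed with the `evaluate` library; the card does not state the actual evaluation script, and the transcripts are made-up placeholders.

```python
# Illustrative only: computing WER/CER of the kind reported above with the
# `evaluate` library. The actual evaluation script for this model is not
# documented; the transcripts below are hypothetical placeholders.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

predictions = ["mun lean dappe"]  # hypothetical model output
references = ["mun lean doppe"]   # hypothetical reference transcript

print("WER:", wer_metric.compute(predictions=predictions, references=references))
print("CER:", cer_metric.compute(predictions=predictions, references=references))
```
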
Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.25
  • num_epochs: 60.0
  • mixed_precision_training: Native AMP
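
As a minimal sketch, the configuration above maps onto `transformers.TrainingArguments` as shown below, assuming the Trainer API was used; the training script is not included with the card, so `output_dir`, `fp16=True` (one reading of "Native AMP"), and the per-epoch evaluation strategy are assumptions.

```python
# A sketch mapping the hyperparameters above onto transformers.TrainingArguments.
# Only the listed values come from this card; output_dir is a placeholder,
# fp16=True is an assumed reading of "Native AMP", and eval_strategy="epoch"
# is inferred from the one-eval-per-epoch results table below.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="base_sami_22k_ftpseudo",  # placeholder
    learning_rate=5e-4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.25,
    num_train_epochs=60.0,
    fp16=True,               # "Native AMP" mixed-precision training (assumed fp16)
    eval_strategy="epoch",   # assumed: the table reports one evaluation per epoch
)
```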

Training results

| Training Loss | Epoch | Step | Validation Loss | WER | CER |
|:---:|:---:|:---:|:---:|:---:|:---:|
| 3988.287 | 1.0 | 1080 | 303.8312 | 0.5742 | 0.1722 |
| 674.1274 | 2.0 | 2160 | 270.2952 | 0.4506 | 0.1415 |
| 517.9397 | 3.0 | 3240 | 339.8132 | 0.4931 | 0.1630 |
| 478.4345 | 4.0 | 4320 | 292.6837 | 0.4896 | 0.1691 |
| 476.3436 | 5.0 | 5400 | 331.5307 | 0.5095 | 0.1799 |
| 507.3865 | 6.0 | 6480 | 359.2405 | 0.5620 | 0.2006 |
| 511.676 | 7.0 | 7560 | 406.4503 | 0.5881 | 0.2164 |
| 546.8947 | 8.0 | 8640 | 367.2822 | 0.5835 | 0.2145 |
| 578.7039 | 9.0 | 9720 | 441.9370 | 0.6559 | 0.2586 |
| 610.5788 | 10.0 | 10800 | 454.6888 | 0.6856 | 0.2609 |
| 658.402 | 11.0 | 11880 | 472.7102 | 0.7457 | 0.3024 |
| 692.8005 | 12.0 | 12960 | 474.2548 | 0.6993 | 0.2891 |
| 726.6378 | 13.0 | 14040 | 476.5723 | 0.7154 | 0.2942 |
| 775.3879 | 14.0 | 15120 | 469.1171 | 0.7360 | 0.2998 |
| 803.9686 | 15.0 | 16200 | 503.4136 | 0.7707 | 0.3031 |
| 818.0579 | 16.0 | 17280 | 544.9781 | 0.7587 | 0.3132 |
| 808.2149 | 17.0 | 18360 | 493.0830 | 0.7396 | 0.3015 |
| 767.5317 | 18.0 | 19440 | 527.8341 | 0.7296 | 0.3046 |
| 739.8194 | 19.0 | 20520 | 500.3179 | 0.7558 | 0.3085 |
| 716.691 | 20.0 | 21600 | 545.5074 | 0.7235 | 0.2984 |
| 682.661 | 21.0 | 22680 | 516.1239 | 0.7511 | 0.2900 |
| 657.0491 | 22.0 | 23760 | 549.7004 | 0.6968 | 0.2776 |
| 629.1355 | 23.0 | 24840 | 500.3793 | 0.6974 | 0.2808 |
| 607.2812 | 24.0 | 25920 | 528.1496 | 0.6959 | 0.2700 |
| 595.4605 | 25.0 | 27000 | 495.3539 | 0.7015 | 0.2834 |
| 555.9978 | 26.0 | 28080 | 500.2841 | 0.7071 | 0.2782 |
| 544.9409 | 27.0 | 29160 | 476.8067 | 0.7075 | 0.2840 |
| 517.4491 | 28.0 | 30240 | 513.6489 | 0.6824 | 0.2703 |
| 502.3091 | 29.0 | 31320 | 450.8210 | 0.6880 | 0.2624 |
| 477.324 | 30.0 | 32400 | 469.6162 | 0.6562 | 0.2616 |
| 461.2854 | 31.0 | 33480 | 480.2810 | 0.6640 | 0.2484 |
| 452.682 | 32.0 | 34560 | 477.9762 | 0.6638 | 0.2652 |
| 424.353 | 33.0 | 35640 | 444.6511 | 0.6533 | 0.2520 |
| 417.6179 | 34.0 | 36720 | 412.5329 | 0.6504 | 0.2526 |
| 389.705 | 35.0 | 37800 | 485.3770 | 0.6744 | 0.2633 |
| 375.7767 | 36.0 | 38880 | 467.3829 | 0.6474 | 0.2664 |
| 361.8829 | 37.0 | 39960 | 469.9674 | 0.6312 | 0.2517 |
| 352.311 | 38.0 | 41040 | 457.0285 | 0.6495 | 0.2545 |
| 340.1846 | 39.0 | 42120 | 463.1925 | 0.6345 | 0.2462 |
| 323.3272 | 40.0 | 43200 | 421.0725 | 0.6171 | 0.2394 |
| 312.6201 | 41.0 | 44280 | 443.3647 | 0.6201 | 0.2384 |
| 301.6251 | 42.0 | 45360 | 429.3776 | 0.6105 | 0.2350 |
| 284.7902 | 43.0 | 46440 | 466.1553 | 0.6021 | 0.2321 |
| 279.8459 | 44.0 | 47520 | 487.2148 | 0.6162 | 0.2319 |
| 260.5616 | 45.0 | 48600 | 445.4757 | 0.6023 | 0.2306 |
| 254.3347 | 46.0 | 49680 | 439.6965 | 0.6054 | 0.2392 |
| 244.043 | 47.0 | 50760 | 459.5868 | 0.5885 | 0.2317 |
| 227.4755 | 48.0 | 51840 | 492.8037 | 0.6002 | 0.2308 |
| 216.7 | 49.0 | 52920 | 452.6693 | 0.5934 | 0.2283 |
| 211.8976 | 50.0 | 54000 | 482.3886 | 0.5947 | 0.2288 |
| 202.0287 | 51.0 | 55080 | 475.8258 | 0.6053 | 0.2353 |
| 186.2731 | 52.0 | 56160 | 465.3925 | 0.5908 | 0.2311 |
| 187.1888 | 53.0 | 57240 | 459.6522 | 0.5890 | 0.2247 |
| 179.0453 | 54.0 | 58320 | 473.7304 | 0.5789 | 0.2243 |
| 165.2614 | 55.0 | 59400 | 453.9692 | 0.5788 | 0.2238 |
| 160.4416 | 56.0 | 60480 | 474.8051 | 0.5732 | 0.2212 |
| 153.8781 | 57.0 | 61560 | 478.4581 | 0.5729 | 0.2202 |
| 151.1706 | 58.0 | 62640 | 467.0158 | 0.5688 | 0.2196 |
| 147.0876 | 59.0 | 63720 | 474.2252 | 0.5603 | 0.2160 |
| 143.0797 | 60.0 | 64800 | 469.5599 | 0.5641 | 0.2168 |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
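
A minimal usage sketch follows. It assumes a CTC-based speech recognition architecture (e.g. Wav2Vec2-style, which the model page's reported ~94.4M-parameter F32 checkpoint would be consistent with) and a 16 kHz input sampling rate; the card confirms neither, and the model id is a placeholder.

```python
# Hypothetical inference sketch: assumes a CTC-based ASR architecture and a
# 16 kHz input sampling rate; neither is stated on the card, and the model id
# below is a placeholder.
import numpy as np
import torch
from transformers import AutoModelForCTC, AutoProcessor

model_id = "path/to/base_sami_22k_ftpseudo"  # placeholder repo id
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForCTC.from_pretrained(model_id)

speech = np.zeros(16000, dtype=np.float32)  # one second of silence as dummy audio
inputs = processor(speech, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

pred_ids = logits.argmax(dim=-1)
print(processor.batch_decode(pred_ids))
```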