mms-1b-all-swagen-female-15hrs-42

This model is a fine-tuned version of facebook/mms-1b-all on the SWAGEN - SWA dataset. It achieves the following results on the evaluation set (a minimal inference sketch follows the results):

  • Loss: 0.2061
  • Wer: 0.1855
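
For convenience, a minimal inference sketch, assuming the standard Wav2Vec2 CTC interface that MMS checkpoints use in Transformers; the audio file path and 16 kHz resampling are illustrative:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-all-swagen-female-15hrs-42"

processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)

# Load a 16 kHz mono waveform; "sample.wav" is a placeholder path.
speech, _ = librosa.load("sample.wav", sr=16_000, mono=True)

inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Greedy CTC decoding of the most likely token at each frame.
predicted_ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(predicted_ids)[0])
```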

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto TrainingArguments follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 5.0
  • mixed_precision_training: Native AMP
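
A sketch of how these values map onto transformers.TrainingArguments, assuming the standard Trainer setup; the output_dir is a placeholder, not taken from the card:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="mms-1b-all-swagen-female-15hrs-42",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,  # effective train batch size: 8 * 2 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=5.0,
    fp16=True,  # native AMP mixed-precision training
)
```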

Training results

Training Loss | Epoch  | Step | Validation Loss | Wer
------------- | ------ | ---- | --------------- | ------
7.959         | 0.1572 | 100  | 4.5467          | 1.0
4.1239        | 0.3145 | 200  | 3.8855          | 1.0
3.2097        | 0.4717 | 300  | 2.8369          | 0.9861
0.6263        | 0.6289 | 400  | 0.2378          | 0.1940
0.2794        | 0.7862 | 500  | 0.2318          | 0.1907
0.2517        | 0.9434 | 600  | 0.2285          | 0.1898
0.2385        | 1.1006 | 700  | 0.2243          | 0.1888
0.2469        | 1.2579 | 800  | 0.2187          | 0.1888
0.2474        | 1.4151 | 900  | 0.2183          | 0.1861
0.2379        | 1.5723 | 1000 | 0.2135          | 0.1884
0.2345        | 1.7296 | 1100 | 0.2091          | 0.1859
0.2262        | 1.8868 | 1200 | 0.2060          | 0.1855
0.2318        | 2.0440 | 1300 | 0.2053          | 0.1830
0.2244        | 2.2013 | 1400 | 0.2051          | 0.1861
0.2222        | 2.3585 | 1500 | 0.2031          | 0.1844
0.2264        | 2.5157 | 1600 | 0.2042          | 0.1853
0.2121        | 2.6730 | 1700 | 0.2001          | 0.1838
0.2234        | 2.8302 | 1800 | 0.2032          | 0.1853
0.2315        | 2.9874 | 1900 | 0.2033          | 0.1840
0.2135        | 3.1447 | 2000 | 0.2003          | 0.1838
0.2154        | 3.3019 | 2100 | 0.2023          | 0.1836
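
For reference, Wer above denotes word error rate (substitutions, insertions, and deletions divided by the number of reference words). A minimal sketch of computing it with the evaluate library; the reference and prediction strings are illustrative, not from the evaluation set:

```python
import evaluate

wer_metric = evaluate.load("wer")

# Illustrative strings; in practice these come from the eval set
# references and the model's decoded transcriptions.
references = ["habari ya asubuhi"]
predictions = ["habari ya asubui"]

print(wer_metric.compute(references=references, predictions=predictions))
```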

Framework versions

  • Transformers 4.53.0.dev0
  • PyTorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0
