mms-1b-all-bigcgen-male-5hrs-42

This model is a fine-tuned version of facebook/mms-1b-all on the BIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4452
  • WER: 0.4722
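
The reported WER (word error rate) is the word-level edit distance between the model transcript and the reference, divided by the number of reference words. Evaluation pipelines for this model family typically use a library such as `jiwer`; the following is a minimal pure-Python sketch of the metric for illustration only.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    # Dynamic-programming table for edit distance over word sequences
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words → 1/6 ≈ 0.1667
score = wer("the cat sat on the mat", "the cat sat on mat")
```

A WER of 0.4722 therefore means roughly 47 word-level errors (substitutions, insertions, deletions) per 100 reference words.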

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
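
Two of these settings can be sketched concretely. The effective batch size of 16 is the per-device train batch size multiplied by the gradient-accumulation steps, and the linear scheduler ramps the learning rate up over the 100 warmup steps and then decays it linearly toward zero. The sketch below assumes a total step count of 2900 (the last step logged in the results table below); the actual decay horizon depends on the configured number of epochs.

```python
def linear_warmup_lr(step: int, base_lr: float = 3e-4,
                     warmup_steps: int = 100, total_steps: int = 2900) -> float:
    """Linear warmup to base_lr, then linear decay to 0
    (mirrors lr_scheduler_type: linear with lr_scheduler_warmup_steps: 100)."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    return base_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

# total_train_batch_size = train_batch_size * gradient_accumulation_steps
effective_batch = 8 * 2  # → 16
```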

Training results

| Training Loss | Epoch   | Step | Validation Loss | WER    |
|:-------------:|:-------:|:----:|:---------------:|:------:|
| 8.1401        | 0.6211  | 100  | 4.4180          | 1.0094 |
| 4.1168        | 1.2422  | 200  | 3.8054          | 1.0019 |
| 3.7512        | 1.8634  | 300  | 3.5376          | 1.0002 |
| 3.3903        | 2.4845  | 400  | 3.1647          | 1.0019 |
| 3.1996        | 3.1056  | 500  | 2.9068          | 0.9995 |
| 2.9987        | 3.7267  | 600  | 2.9281          | 0.9995 |
| 2.9207        | 4.3478  | 700  | 2.6511          | 1.0    |
| 1.6925        | 4.9689  | 800  | 0.6456          | 0.6278 |
| 0.7698        | 5.5901  | 900  | 0.5464          | 0.5850 |
| 0.6804        | 6.2112  | 1000 | 0.5268          | 0.5374 |
| 0.6414        | 6.8323  | 1100 | 0.4935          | 0.5256 |
| 0.5936        | 7.4534  | 1200 | 0.4801          | 0.5061 |
| 0.6344        | 8.0745  | 1300 | 0.4744          | 0.5028 |
| 0.5952        | 8.6957  | 1400 | 0.4729          | 0.4963 |
| 0.5922        | 9.3168  | 1500 | 0.4647          | 0.4850 |
| 0.5804        | 9.9379  | 1600 | 0.4512          | 0.4809 |
| 0.5836        | 10.5590 | 1700 | 0.4538          | 0.4749 |
| 0.5617        | 11.1801 | 1800 | 0.4510          | 0.4730 |
| 0.5602        | 11.8012 | 1900 | 0.4526          | 0.4744 |
| 0.556         | 12.4224 | 2000 | 0.4452          | 0.4725 |
| 0.5725        | 13.0435 | 2100 | 0.4523          | 0.4722 |
| 0.5615        | 13.6646 | 2200 | 0.4438          | 0.4655 |
| 0.5542        | 14.2857 | 2300 | 0.4423          | 0.4684 |
| 0.5278        | 14.9068 | 2400 | 0.4457          | 0.4684 |
| 0.5314        | 15.5280 | 2500 | 0.4389          | 0.4604 |
| 0.5609        | 16.1491 | 2600 | 0.4421          | 0.4597 |
| 0.5383        | 16.7702 | 2700 | 0.4401          | 0.4520 |
| 0.5271        | 17.3913 | 2800 | 0.4407          | 0.4631 |
| 0.527         | 18.0124 | 2900 | 0.4401          | 0.4467 |

Framework versions

  • Transformers 4.53.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0