mms-1b-all-bigcgen-male-5hrs-52

This model is a fine-tuned version of facebook/mms-1b-all on the BIGCGEN - BEM dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4392
  • Wer: 0.4520
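Since this checkpoint is a CTC fine-tune of facebook/mms-1b-all, it can be loaded like any Wav2Vec2-style model. Below is a minimal inference sketch, not the authors' evaluation code; the audio filename is a placeholder, and it assumes the processor/tokenizer files are bundled with this repository:

```python
import torch
import librosa
from transformers import AutoProcessor, Wav2Vec2ForCTC

model_id = "csikasote/mms-1b-all-bigcgen-male-5hrs-52"
processor = AutoProcessor.from_pretrained(model_id)
model = Wav2Vec2ForCTC.from_pretrained(model_id)
model.eval()

# MMS models expect 16 kHz mono audio.
speech, _ = librosa.load("example.wav", sr=16_000, mono=True)  # placeholder file
inputs = processor(speech, sampling_rate=16_000, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Greedy (argmax) CTC decoding.
ids = torch.argmax(logits, dim=-1)
print(processor.batch_decode(ids)[0])
```

Greedy decoding is shown for simplicity; pairing the model with an external language-model decoder may reduce the Wer further.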

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 52
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 16
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 30.0
  • mixed_precision_training: Native AMP
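These settings map roughly onto transformers.TrainingArguments as sketched below. This is a hypothetical reconstruction, not the actual training script; output_dir is a placeholder, and the effective batch size of 16 assumes a single device:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./mms-1b-all-bigcgen-male-5hrs-52",  # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=52,
    gradient_accumulation_steps=2,  # 8 * 2 = 16 effective (single device assumed)
    lr_scheduler_type="linear",
    warmup_steps=100,
    num_train_epochs=30.0,
    fp16=True,  # "Native AMP" mixed precision
)
```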

Training results

| Training Loss | Epoch   | Step | Validation Loss | Wer    |
|---------------|---------|------|-----------------|--------|
| 8.9818        | 0.6211  | 100  | 4.9435          | 1.0139 |
| 4.8534        | 1.2422  | 200  | 4.2897          | 1.0    |
| 4.1334        | 1.8634  | 300  | 2.5552          | 1.0    |
| 1.0089        | 2.4845  | 400  | 0.6157          | 0.5893 |
| 0.7657        | 3.1056  | 500  | 0.5726          | 0.5458 |
| 0.6709        | 3.7267  | 600  | 0.5124          | 0.5251 |
| 0.6495        | 4.3478  | 700  | 0.4878          | 0.4989 |
| 0.633         | 4.9689  | 800  | 0.4747          | 0.4965 |
| 0.5976        | 5.5901  | 900  | 0.4695          | 0.4900 |
| 0.6031        | 6.2112  | 1000 | 0.4598          | 0.4910 |
| 0.6004        | 6.8323  | 1100 | 0.4564          | 0.4775 |
| 0.575         | 7.4534  | 1200 | 0.4562          | 0.4689 |
| 0.5721        | 8.0745  | 1300 | 0.4521          | 0.4684 |
| 0.5812        | 8.6957  | 1400 | 0.4502          | 0.4645 |
| 0.5445        | 9.3168  | 1500 | 0.4557          | 0.4650 |
| 0.5552        | 9.9379  | 1600 | 0.4427          | 0.4619 |
| 0.5403        | 10.5590 | 1700 | 0.4443          | 0.4556 |
| 0.5677        | 11.1801 | 1800 | 0.4502          | 0.4568 |
| 0.5519        | 11.8012 | 1900 | 0.4409          | 0.4542 |
| 0.5404        | 12.4224 | 2000 | 0.4392          | 0.4516 |
| 0.537         | 13.0435 | 2100 | 0.4394          | 0.4547 |
| 0.5411        | 13.6646 | 2200 | 0.4416          | 0.4604 |
| 0.5292        | 14.2857 | 2300 | 0.4355          | 0.4487 |
| 0.5381        | 14.9068 | 2400 | 0.4357          | 0.4451 |
| 0.5267        | 15.5280 | 2500 | 0.4341          | 0.4475 |
| 0.5212        | 16.1491 | 2600 | 0.4381          | 0.4484 |
| 0.5311        | 16.7702 | 2700 | 0.4402          | 0.4470 |
| 0.4983        | 17.3913 | 2800 | 0.4355          | 0.4484 |
| 0.5205        | 18.0124 | 2900 | 0.4356          | 0.4400 |
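The Wer column reports the word error rate on the validation set. As a reference for how such numbers are produced, the evaluate library computes it from paired transcripts; a toy sketch with made-up English strings (not from the BIGCGEN data):

```python
import evaluate

wer_metric = evaluate.load("wer")

references  = ["the cat sat on the mat"]
predictions = ["the cat sat on mat"]  # one word deleted

# 1 deletion over 6 reference words -> WER of about 0.167
print(wer_metric.compute(predictions=predictions, references=references))
```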

Framework versions

  • Transformers 4.53.0.dev0
  • Pytorch 2.6.0+cu124
  • Datasets 3.6.0
  • Tokenizers 0.21.0