xlsr-a-no

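This model is a fine-tuned version of facebook/wav2vec2-large-xlsr-53. The training dataset is not specified (the card records it as None). Evaluation-set loss and word error rate (WER) for each logged checkpoint are listed in the table at the end of this card.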

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0004
train_batch_size: 8
eval_batch_size: 8
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 16
optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 132
num_epochs: 100
mixed_precision_training: Native AMP
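
The relationship between some of these values can be checked with a few lines of Python. The sketch below shows how total_train_batch_size follows from train_batch_size and gradient_accumulation_steps, and how a linear schedule with warmup typically shapes the learning rate. Two things are assumptions rather than facts from this card: a single training device, and 70 optimizer steps per epoch (inferred from the results table, where step 1400 is logged at epoch 20.0). The function names are illustrative and not part of any library.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 4e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 2
WARMUP_STEPS = 132
NUM_EPOCHS = 100

# Assumption: 70 optimizer steps per epoch, inferred from the results table
# (step 1400 is logged at epoch 20.0). The card does not state this directly.
STEPS_PER_EPOCH = 70
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS


def effective_batch_size(per_device: int, accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Examples consumed per optimizer update (num_devices=1 is an assumption)."""
    return per_device * accumulation_steps * num_devices


def linear_schedule_lr(step: int,
                       base_lr: float = LEARNING_RATE,
                       warmup_steps: int = WARMUP_STEPS,
                       total_steps: int = TOTAL_STEPS) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr at the end of warmup to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    for step in (0, 66, 132, 3566, 7000):
        print(f"step {step:>4}: lr = {linear_schedule_lr(step):.6f}")
```

Under these assumptions the learning rate peaks at 0.0004 at step 132 and reaches zero at step 7000, so most of the logged checkpoints were trained at a steadily shrinking learning rate.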

Training Loss	Epoch	Step	Validation Loss	WER
4.2929	2.8633	200	2.3482	1.0
0.9068	5.7194	400	0.2885	0.5495
0.1691	8.5755	600	0.2612	0.3732
0.0828	11.4317	800	0.2968	0.3595
0.0473	14.2878	1000	0.3520	0.3515
0.0428	17.1439	1200	0.3341	0.3595
0.0358	20.0	1400	0.3301	0.3345
0.0267	22.8633	1600	0.3461	0.3242
0.0247	25.7194	1800	0.3533	0.3333
0.0219	28.5755	2000	0.3746	0.3242
0.0245	31.4317	2200	0.3684	0.3254
0.0151	34.2878	2400	0.3780	0.3197
0.0119	37.1439	2600	0.3322	0.3220
0.0151	40.0	2800	0.3688	0.3231
0.0131	42.8633	3000	0.3895	0.3231
0.0118	45.7194	3200	0.4120	0.3242
0.0079	48.5755	3400	0.4206	0.3231
0.0062	51.4317	3600	0.4137	0.3242
0.0059	54.2878	3800	0.3937	0.3220
0.0085	57.1439	4000	0.4004	0.3208
0.0068	60.0	4200	0.4266	0.3242
0.0065	62.8633	4400	0.3980	0.3208
0.0067	65.7194	4600	0.3957	0.3208
0.0051	68.5755	4800	0.4251	0.3220
0.0035	71.4317	5000	0.4428	0.3220
0.0047	74.2878	5200	0.4380	0.3231
0.0034	77.1439	5400	0.4388	0.3220
0.0033	80.0	5600	0.4544	0.3197
0.002	82.8633	5800	0.4368	0.3197
0.0042	85.7194	6000	0.4402	0.3197
0.0028	88.5755	6200	0.4482	0.3220
0.002	91.4317	6400	0.4547	0.3208
0.0012	94.2878	6600	0.4565	0.3208
0.0017	97.1439	6800	0.4542	0.3208
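
To pick a checkpoint from a log like this, it can help to parse the rows and compare the step with the lowest validation loss against the step with the lowest WER. The following is a minimal sketch that does so for the rows above. The `EvalRow` type and `parse_rows` helper are hypothetical names introduced for illustration; the numbers are copied from the table.

```python
from typing import List, NamedTuple

# Rows copied from the results table:
# training loss, epoch, step, validation loss, WER.
RESULTS = """
4.2929 2.8633 200 2.3482 1.0
0.9068 5.7194 400 0.2885 0.5495
0.1691 8.5755 600 0.2612 0.3732
0.0828 11.4317 800 0.2968 0.3595
0.0473 14.2878 1000 0.3520 0.3515
0.0428 17.1439 1200 0.3341 0.3595
0.0358 20.0 1400 0.3301 0.3345
0.0267 22.8633 1600 0.3461 0.3242
0.0247 25.7194 1800 0.3533 0.3333
0.0219 28.5755 2000 0.3746 0.3242
0.0245 31.4317 2200 0.3684 0.3254
0.0151 34.2878 2400 0.3780 0.3197
0.0119 37.1439 2600 0.3322 0.3220
0.0151 40.0 2800 0.3688 0.3231
0.0131 42.8633 3000 0.3895 0.3231
0.0118 45.7194 3200 0.4120 0.3242
0.0079 48.5755 3400 0.4206 0.3231
0.0062 51.4317 3600 0.4137 0.3242
0.0059 54.2878 3800 0.3937 0.3220
0.0085 57.1439 4000 0.4004 0.3208
0.0068 60.0 4200 0.4266 0.3242
0.0065 62.8633 4400 0.3980 0.3208
0.0067 65.7194 4600 0.3957 0.3208
0.0051 68.5755 4800 0.4251 0.3220
0.0035 71.4317 5000 0.4428 0.3220
0.0047 74.2878 5200 0.4380 0.3231
0.0034 77.1439 5400 0.4388 0.3220
0.0033 80.0 5600 0.4544 0.3197
0.002 82.8633 5800 0.4368 0.3197
0.0042 85.7194 6000 0.4402 0.3197
0.0028 88.5755 6200 0.4482 0.3220
0.002 91.4317 6400 0.4547 0.3208
0.0012 94.2878 6600 0.4565 0.3208
0.0017 97.1439 6800 0.4542 0.3208
"""


class EvalRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float
    wer: float


def parse_rows(text: str) -> List[EvalRow]:
    """Parse whitespace-separated rows into typed records."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, wer = line.split()
        rows.append(EvalRow(float(train_loss), float(epoch), int(step),
                            float(val_loss), float(wer)))
    return rows


ROWS = parse_rows(RESULTS)
# min() returns the first row on ties, i.e. the earliest such checkpoint.
best_wer = min(ROWS, key=lambda r: r.wer)
best_loss = min(ROWS, key=lambda r: r.val_loss)

if __name__ == "__main__":
    print(f"lowest WER:             {best_wer.wer} at step {best_wer.step}")
    print(f"lowest validation loss: {best_loss.val_loss} at step {best_loss.step}")
```

On these rows the lowest validation loss (0.2612) occurs at step 600, while the lowest WER (0.3197) is first reached at step 2400 and stays between 0.3197 and 0.3242 afterwards. Training loss keeps falling while validation loss drifts upward, a pattern that is consistent with overfitting, so selecting a checkpoint by WER rather than taking the final step may be the more reasonable choice.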