whisper-large-v3-turbo-ami-disfluent-full

This model is a fine-tuned version of openai/whisper-large-v3-turbo on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3507
  • WER: 8.4297
  • CER: 4.3573
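
The card does not include usage instructions. As a minimal inference sketch, assuming the checkpoint is loaded through the standard transformers ASR pipeline ("meeting.wav" is a hypothetical input file):

```python
# Minimal inference sketch; assumes transformers (>= 4.54, per the versions
# listed below) and torch are installed. "meeting.wav" is a placeholder path.
import torch
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="JacobLinCool/whisper-large-v3-turbo-ami-disfluent-full",
    torch_dtype=torch.bfloat16,  # assumption: matches the BF16 checkpoint
    device_map="auto",
)

result = asr("meeting.wav", return_timestamps=True)
print(result["text"])
```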

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 32
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • training_steps: 5000
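
The list above maps roughly onto transformers Seq2SeqTrainingArguments. The following is a reconstruction for reference, not the author's actual training script; output_dir and any options not listed above are assumptions:

```python
# Reconstruction of the listed hyperparameters as Seq2SeqTrainingArguments.
# output_dir is an assumption; unlisted options are left at their defaults.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-turbo-ami-disfluent-full",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=4,   # effective train batch size: 8 * 4 = 32
    optim="adamw_torch_fused",       # AdamW, betas=(0.9, 0.999), eps=1e-08
    lr_scheduler_type="linear",
    warmup_steps=500,
    max_steps=5000,
)
```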

Training results

| Training Loss | Epoch  | Step | Validation Loss | WER     | CER     |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|
| No log        | 0      | 0    | 2.8591          | 23.2283 | 14.9979 |
| 0.3457        | 0.1    | 500  | 0.2787          | 9.8557  | 4.9508  |
| 0.252         | 1.0748 | 1000 | 0.2785          | 10.9926 | 5.6876  |
| 0.1053        | 2.0496 | 1500 | 0.2708          | 9.1643  | 4.5877  |
| 0.0505        | 3.0244 | 2000 | 0.3046          | 9.9821  | 5.4330  |
| 0.0544        | 3.1244 | 2500 | 0.2819          | 8.8718  | 4.4522  |
| 0.0209        | 4.0992 | 3000 | 0.3062          | 9.5699  | 5.1405  |
| 0.0111        | 5.074  | 3500 | 0.3224          | 8.5394  | 4.4230  |
| 0.0023        | 6.0488 | 4000 | 0.3427          | 8.4131  | 4.3766  |
| 0.0018        | 7.0236 | 4500 | 0.3489          | 8.3932  | 4.3516  |
| 0.0018        | 7.1236 | 5000 | 0.3507          | 8.4297  | 4.3573  |
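
The WER and CER columns appear to be percentages. Figures like these are typically computed with the evaluate library; a minimal sketch with illustrative strings (the actual evaluation set is not documented here):

```python
# Sketch of how WER/CER values like those above are typically computed.
# The reference/prediction strings are illustrative only.
import evaluate

wer_metric = evaluate.load("wer")
cer_metric = evaluate.load("cer")

references = ["okay so um let's move on to the next item"]
predictions = ["okay so uh let's move on to the next item"]

# compute() returns a fraction; multiplying by 100 matches the table's scale.
print("WER:", 100 * wer_metric.compute(references=references, predictions=predictions))
print("CER:", 100 * cer_metric.compute(references=references, predictions=predictions))
```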

Framework versions

  • Transformers 4.54.0
  • PyTorch 2.8.0.dev20250319+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.2