Whisper Medium IT

This model is a fine-tuned version of miosipof/asr2_aug_IT_v4_merged on the b-brave-balanced-augmented dataset. It achieves the following results on the evaluation set:

Loss: 0.0005
Wer: 0.0
Cer: 0.0
Lr: 0.0000
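
The card does not say how Wer and Cer were computed or whether the transcripts were normalized first. As a rough illustration only, the sketch below computes both metrics from a Levenshtein edit distance in plain Python. It assumes the reported values are percentages (the per-epoch values such as 24.6508 suggest this), and the function names `edit_distance`, `wer`, and `cer` are hypothetical helpers, not part of the training code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumed scale, see lead-in)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent, counting spaces as characters."""
    if not reference:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(list(reference), list(hypothesis)) / len(reference)


if __name__ == "__main__":
    ref = "il gatto dorme sul divano"
    hyp = "il gatto dorme sul divino"
    print(wer(ref, hyp), cer(ref, hyp))
```

One wrong word out of five gives a much higher Wer than Cer here, which is why the two columns in the results can diverge.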

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 32
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 4
total_train_batch_size: 128
optimizer: adamw_torch_4bit with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
lr_scheduler_warmup_ratio: 0.3
num_epochs: 10
mixed_precision_training: Native AMP
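
The arithmetic behind these settings can be sketched in a few lines. The example below reproduces the total batch size and a generic linear warmup-then-decay learning rate curve. Two values are assumptions rather than facts from the card: a single training device, and a total of 670 optimizer steps, taken from the last row of the training results. The function `linear_lr` is a hypothetical helper that mimics the usual shape of a linear schedule; it is not the training code itself.

```python
import math

PER_DEVICE_BATCH = 32   # train_batch_size
GRAD_ACCUM_STEPS = 4    # gradient_accumulation_steps
NUM_DEVICES = 1         # assumption: device count is not stated in the card
PEAK_LR = 1e-5          # learning_rate
WARMUP_RATIO = 0.3      # lr_scheduler_warmup_ratio
TOTAL_STEPS = 670       # assumption: final step in the training results

# Samples contributing to each optimizer update.
effective_batch = PER_DEVICE_BATCH * GRAD_ACCUM_STEPS * NUM_DEVICES

# Number of steps spent ramping the learning rate up from zero.
warmup_steps = math.ceil(TOTAL_STEPS * WARMUP_RATIO)


def linear_lr(step):
    """Learning rate at a given optimizer step: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return PEAK_LR * step / max(1, warmup_steps)
    remaining = max(0, TOTAL_STEPS - step)
    return PEAK_LR * remaining / max(1, TOTAL_STEPS - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    print("warmup steps:", warmup_steps)
    for step in (0, 68, 201, 408, 670):
        print(step, f"{linear_lr(step):.2e}")
```

Under these assumptions the rate peaks around step 201 and reaches zero at the final step, which is consistent with the reported Lr of 0.0000.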

Training results

Training Loss	Epoch	Step	Validation Loss	Wer	Cer
1.1582	1.0	68	0.1790	24.6508	16.4515
0.9748	2.0	136	0.1054	15.6943	10.5674
0.5207	3.0	204	0.0574	10.3533	7.6854
0.3516	4.0	272	0.0296	19.9671	18.2828
0.2355	5.0	340	0.0113	2.3007	1.6361
0.1018	6.0	408	0.0056	0.3287	0.2252
0.0639	7.0	476	0.0037	0.1643	0.0600
0.0443	8.0	544	0.0033	0.4108	0.2252
0.015	9.0	612	0.0008	0.0	0.0
0.0095	9.8625	670	0.0005	0.0	0.0
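
Validation loss falls at every epoch in the table, but Wer does not. A small sketch like the following, with the epoch, validation loss, and Wer values copied from the rows above, can flag the epochs where Wer rose even though validation loss improved. The helper name `wer_regressions` is hypothetical.

```python
# (epoch, validation_loss, wer) copied from the training results table.
ROWS = [
    (1.0, 0.1790, 24.6508),
    (2.0, 0.1054, 15.6943),
    (3.0, 0.0574, 10.3533),
    (4.0, 0.0296, 19.9671),
    (5.0, 0.0113, 2.3007),
    (6.0, 0.0056, 0.3287),
    (7.0, 0.0037, 0.1643),
    (8.0, 0.0033, 0.4108),
    (9.0, 0.0008, 0.0),
    (9.8625, 0.0005, 0.0),
]


def wer_regressions(rows):
    """Return epochs where Wer increased while validation loss decreased."""
    flagged = []
    for prev, curr in zip(rows, rows[1:]):
        loss_fell = curr[1] < prev[1]
        wer_rose = curr[2] > prev[2]
        if loss_fell and wer_rose:
            flagged.append(curr[0])
    return flagged


if __name__ == "__main__":
    print(wer_regressions(ROWS))
```

A check of this kind is one reason to select checkpoints on Wer or Cer rather than on validation loss alone.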

Framework versions

Transformers 4.47.1
Pytorch 2.6.0+cu124
Datasets 3.5.0
Tokenizers 0.21.1

Model repository: miosipof/asr2_aug_IT_v4_full
