Whisper Large-v2 Java - HQ TTS

This model is a fine-tuned version of openai/whisper-large-v2 for Javanese automatic speech recognition, trained on the jv_id_tts dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1792
  • WER: 9.2409
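
Since this checkpoint is a Whisper ASR fine-tune on Javanese speech, a minimal transcription sketch with the Transformers pipeline is shown below. The repository id bagasshw/whisper-large-v2 is taken from this card; the audio file path and the explicit language/task hints are illustrative assumptions.

```python
# Minimal transcription sketch (not from the original card).
# "sample.wav" is a placeholder for a 16 kHz mono recording of Javanese speech.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="bagasshw/whisper-large-v2",
    chunk_length_s=30,  # Whisper operates on 30-second windows
)

result = asr(
    "sample.wav",
    # Hints only; they may already be baked into the model's generation config.
    generate_kwargs={"language": "javanese", "task": "transcribe"},
)
print(result["text"])
```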

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 1e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 10000
  • mixed_precision_training: Native AMP
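
The listed values map onto standard Seq2SeqTrainingArguments options roughly as in the sketch below; the output directory is a placeholder, and the actual training script is not part of this card.

```python
# Sketch of the hyperparameters above as Seq2SeqTrainingArguments
# (output_dir is a placeholder; this is not the card author's script).
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v2-jv",   # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,        # effective train batch size of 4
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=100,
    max_steps=10000,
    fp16=True,                            # "Native AMP" mixed precision
)
```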

Training results

| Training Loss | Epoch  | Step  | Validation Loss | WER     |
|:-------------:|:------:|:-----:|:---------------:|:-------:|
| 0.2752        | 0.8587 | 1000  | 0.2629          | 18.0899 |
| 0.1117        | 1.7170 | 2000  | 0.2091          | 14.7071 |
| 0.0656        | 2.5754 | 3000  | 0.1855          | 12.2112 |
| 0.0294        | 3.4337 | 4000  | 0.1709          | 10.8911 |
| 0.0182        | 4.2920 | 5000  | 0.1662          | 10.4992 |
| 0.0100        | 5.1503 | 6000  | 0.1709          | 10.0660 |
| 0.0084        | 6.0086 | 7000  | 0.1681          | 9.6328  |
| 0.0057        | 6.8673 | 8000  | 0.1689          | 9.0965  |
| 0.0019        | 7.7256 | 9000  | 0.1780          | 9.2409  |
| 0.0005        | 8.5839 | 10000 | 0.1792          | 9.2409  |
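
The WER column is on a percentage scale (0-100). As an illustration only, a score in that format can be computed with the evaluate library; the reference and prediction strings below are invented, not card data.

```python
# Illustrative WER computation (invented strings): one substituted word out of three.
import evaluate

wer_metric = evaluate.load("wer")

references = ["sugeng enjing sedaya"]    # invented Javanese reference transcript
predictions = ["sugeng enjang sedaya"]   # invented model output

# evaluate returns a fraction; multiplying by 100 matches the table's scale.
wer = 100 * wer_metric.compute(references=references, predictions=predictions)
print(f"WER: {wer:.4f}")   # -> 33.3333
```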

Framework versions

  • Transformers 4.51.3
  • PyTorch 2.7.0+cu126
  • Datasets 2.18.0
  • Tokenizers 0.21.1