---
base_model: openai/whisper-large-v3
datasets:
- bn
language: bn
library_name: transformers
license: apache-2.0
model-index:
- name: Finetuned openai/whisper-large-v3 on Bengali
  results:
  - task:
      type: automatic-speech-recognition
      name: Speech-to-Text
    dataset:
      name: Common Voice (Bengali)
      type: common_voice
    metrics:
    - type: wer
      value: 9.651
---
This model is openai/whisper-large-v3 finetuned on 21,409 Bengali training audio samples from Common Voice (cv-corpus-21.0-2025-03-14/bn).
This model was created from the Mozilla.ai Blueprint: speech-to-text-finetune.
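
A minimal usage sketch, assuming the model is loaded through the `transformers` automatic-speech-recognition pipeline (the metadata lists `library_name: transformers`). This card does not state the repository id, so `model_id` is a required argument rather than a filled-in value. `transcribe` is a hypothetical helper name, and pinning decoding to Bengali through `generate_kwargs` is an assumption about a sensible default, not a documented requirement of this model.

```python
def transcribe(audio_path: str, model_id: str) -> str:
    """Transcribe one audio file with a Whisper checkpoint (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    # Build an ASR pipeline around the checkpoint identified by model_id.
    asr = pipeline("automatic-speech-recognition", model=model_id)
    result = asr(
        audio_path,
        # ASSUMPTION: pin decoding to Bengali transcription instead of
        # relying on Whisper's automatic language detection.
        generate_kwargs={"language": "bengali", "task": "transcribe"},
    )
    # The pipeline returns a dict; the transcript is under the "text" key.
    return result["text"]
```

Call it with the path to a local audio file and the Hub id of this model; both are left for the reader to supply.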
Evaluation results on 9363 audio samples of Bengali:

Baseline model (before finetuning) on Bengali
- Word Error Rate (Normalized): 55.463
- Word Error Rate (Orthographic): 83.344
- Character Error Rate (Normalized): 35.66
- Character Error Rate (Orthographic): 40.754
- Loss: 0.567

Finetuned model (after finetuning) on Bengali
- Word Error Rate (Normalized): 9.651
- Word Error Rate (Orthographic): 24.288
- Character Error Rate (Normalized): 4.876
- Character Error Rate (Orthographic): 6.312
- Loss: 0.092
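
The difference between the orthographic and normalized error rates above can be illustrated with a small sketch. It is not the evaluation code used for this model. It assumes that "orthographic" means scoring the raw text, that "normalized" means scoring after a simple cleanup (here: punctuation removed, text lowercased, whitespace collapsed), and that the rates are reported as percentages. The functions `edit_distance`, `normalize`, `wer`, and `cer` are hypothetical helpers written for this example.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text):
    # ASSUMPTION: a simplified normalizer, not necessarily the one used here.
    # Replace every Unicode punctuation character with a space, lowercase,
    # then collapse runs of whitespace.
    no_punct = "".join(
        " " if unicodedata.category(c).startswith("P") else c for c in text
    )
    return " ".join(no_punct.lower().split())


def wer(reference, hypothesis):
    """Word error rate in percent: word-level edits / reference word count."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent: character edits / reference length."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


reference = "আমি ভাত খাই।"   # ends with the Bengali danda
hypothesis = "আমি ভাত খাই"   # same words, punctuation missing

orthographic_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
```

In this toy pair the only mismatch is the trailing danda, so the orthographic score counts one wrong word out of three while the normalized score counts none. That is one reason orthographic rates in a results list can sit well above normalized ones.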