# Whisper Small fine-tuned for Kannada
This is a Whisper Small model fine-tuned for the Kannada language on roughly 300 hours of labeled speech data.
## Performance
- Test WER: 29.63%
- Test CER: 7.12%
- Test WER (with normalization): 23.61%
- Test CER (with normalization): 6.21%
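Both metrics are edit-distance based: WER counts word-level substitutions, insertions, and deletions against the reference word count; CER does the same at the character level. A minimal pure-Python sketch of the computation (illustrative only, not the exact evaluation script used for the numbers above):

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, O(len(hyp)) memory."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,          # deletion
                dp[j - 1] + 1,      # insertion
                prev + (r != h),    # substitution (free if tokens match)
            )
    return dp[-1]

def wer(reference, hypothesis):
    """Word Error Rate: word-level edit distance / reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference, hypothesis):
    """Character Error Rate: char-level edit distance / reference length."""
    return edit_distance(reference, hypothesis) / len(reference)
```

Text normalization (e.g. stripping punctuation and unifying whitespace) before scoring is what accounts for the gap between the raw and normalized numbers.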
## Usage
```python
#!pip install whisper_transcriber
from whisper_transcriber import WhisperTranscriber

# Initialize the transcriber
transcriber = WhisperTranscriber(model_name="coild/whisper_small_kannada")

# Transcribe an audio file with automatic transcript printing
results = transcriber.transcribe(
    "audio_file.mp3",
    min_segment=5,
    max_segment=15,
    silence_duration=0.2,
    sample_rate=16000,
    batch_size=4,
    normalize=True,
    normalize_text=True,
    verbose=False,
)

# Access the transcription results manually
for segment in results:
    print(f"\n[{segment['start']} --> {segment['end']}]")
    print(segment["transcript"])
```
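Assuming the segments carry the `start`/`end`/`transcript` fields shown above with times in seconds, they can be rendered as SubRip subtitles with a short helper. This is an illustrative sketch, not part of the `whisper_transcriber` API:

```python
def to_srt(segments):
    """Render a list of {'start', 'end', 'transcript'} segments as SRT text."""

    def ts(seconds):
        # SRT timestamps use the form HH:MM:SS,mmm
        ms = round(seconds * 1000)
        h, ms = divmod(ms, 3_600_000)
        m, ms = divmod(ms, 60_000)
        s, ms = divmod(ms, 1_000)
        return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

    blocks = []
    for i, seg in enumerate(segments, 1):
        blocks.append(
            f"{i}\n{ts(seg['start'])} --> {ts(seg['end'])}\n{seg['transcript']}\n"
        )
    return "\n".join(blocks)

# Example with a single hypothetical segment
print(to_srt([{"start": 0.0, "end": 4.5, "transcript": "ನಮಸ್ಕಾರ"}]))
```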
## Model Details

### Model Description
- Developed by: Ranjan Shettigar
- Language(s) (NLP): kn
- Finetuned from model: openai/whisper-small
- Repository: [More Information Needed]
- Paper [optional]: [More Information Needed]
- Demo [optional]: [More Information Needed]
## Training Details

### Training and evaluation data
Training Data:
Evaluation Data:
### Training Procedure

#### Preprocessing [optional]
[More Information Needed]
#### Training Hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 16
- eval_batch_size: 16
- optimizer: adamw
- epochs: 8
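The reported hyperparameters correspond roughly to the following `transformers` `Seq2SeqTrainingArguments`. This is a hedged reconstruction for orientation only; the actual training script is not published, and the `output_dir`, `optim` variant, and `predict_with_generate` values are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

# Assumed mapping of the reported hyperparameters; unlisted arguments are guesses.
training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper_small_kannada",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=8,
    optim="adamw_torch",
    predict_with_generate=True,
)
```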
## Citation

BibTeX:
[More Information Needed]
Base model: openai/whisper-small

## Evaluation results

Self-reported on the google/fleurs test set:

- WER: 29.63
- CER: 7.12
- WER with normalization: 23.61
- CER with normalization: 6.21