Shunya Labs Hinglish ASR Model
The only Hinglish code-switching STT model that generates transcripts in mixed Hindi-English tokens.
Model Details
Model Description
This is the first speech recognition model designed natively for Hinglish—the natural mix of Hindi and English commonly spoken across India. Unlike conventional approaches that force transcription into a single language, this model generates mixed-language tokens directly, preserving how people actually speak.
- Base Model: OpenAI Whisper Medium
- Post-trained by: Shunya Labs
- Language: Hinglish (Hindi-English code-switching)
Why This Model?
Standard ASR models treat Hindi and English as separate languages, forcing transcription into one or the other. This creates errors when speakers naturally switch between languages mid-sentence—which is how millions of people actually talk. This model was trained specifically on code-switched speech, so it:
- Transcribes Hindi and English tokens as they naturally occur
- Handles mid-sentence language switches accurately
- Runs inference faster by avoiding language-detection overhead
- Delivers higher accuracy on real-world Hinglish speech (see the sketch after this list)
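As a rough illustration of the difference, the sketch below contrasts a vanilla Whisper pipeline, which is typically pinned to a single language at generation time, with this model called without any language hint. The repo id matches the Getting Started snippet further down, and audio.mp3 is a placeholder path; the outputs themselves are not shown.

from transformers import pipeline

# Vanilla Whisper: transcription is usually forced into a single language.
baseline = pipeline("automatic-speech-recognition", model="openai/whisper-medium")
hindi_only = baseline("audio.mp3", generate_kwargs={"language": "hindi"})["text"]

# This model: no language hint needed; Hindi and English tokens are
# emitted in the order they occur in the speech.
hinglish = pipeline("automatic-speech-recognition", model="shunya-labs/hinglish-whisper-medium")
mixed = hinglish("audio.mp3")["text"]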
Demo
- Try the model at: https://huggingface.co/spaces/shunyalabs/Zero_STT_Hinglish_Shunya_Labs
Use Cases
- Transcription of Hinglish conversations, podcasts, and videos
- Voice assistants serving Indian users
- Meeting transcription for Indian workplaces
- Content creation and subtitling
How to Get Started with the Model
Use the code below to get started with the model.
from transformers import pipeline
transcriber = pipeline("automatic-speech-recognition", model="shunya-labs/hinglish-whisper-medium")
result = transcriber("audio.mp3")
print(result["text"])
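For longer recordings such as podcasts, videos, or meetings, the same pipeline can run in chunked mode. This is a minimal sketch assuming the stock transformers long-form options work with this checkpoint; chunk_length_s and return_timestamps are standard pipeline parameters, the repo id is the one used above, and meeting.mp3 is a placeholder path.

import torch
from transformers import pipeline

transcriber = pipeline(
    "automatic-speech-recognition",
    model="shunya-labs/hinglish-whisper-medium",
    chunk_length_s=30,  # process long audio in 30-second windows
    device=0 if torch.cuda.is_available() else -1,  # use a GPU when one is available
)

result = transcriber("meeting.mp3", return_timestamps=True)
for chunk in result["chunks"]:
    print(chunk["timestamp"], chunk["text"])

Each chunk's timestamp comes back as a (start, end) tuple in seconds, which maps directly onto subtitle segments.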
Training Details
Training Data
The openai/whisper-medium base model was post-trained on the Google Vaani dataset as well as proprietary datasets.