Sanskrit Whisper ASR

Fine-tuned Whisper model for Sanskrit speech recognition.

Usage

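import librosa
from transformers import WhisperForConditionalGeneration, WhisperProcessor

# Load the processor (feature extractor + tokenizer) and the fine-tuned model
processor = WhisperProcessor.from_pretrained('rverma251/sanskrit-whisper-asr')
model = WhisperForConditionalGeneration.from_pretrained('rverma251/sanskrit-whisper-asr')

# Decoding settings used with this checkpoint
model.generation_config.language = 'hindi'
model.generation_config.task = 'transcribe'

# Whisper expects 16 kHz mono audio; librosa resamples on load
audio, sr = librosa.load('sanskrit_audio.wav', sr=16000)
inputs = processor.feature_extractor(audio, sampling_rate=sr, return_tensors='pt')
predicted_ids = model.generate(inputs.input_features)
# batch_decode returns a list with one string per input clip
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription)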
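
Whisper's feature extractor works on 30-second windows and, by default, pads shorter clips and truncates longer ones, so a long recording generally needs to be split before it is transcribed with the code above. The following is a minimal sketch of one way to do that, assuming fixed-length, non-overlapping chunks. The helper name `split_into_chunks` and its parameters are hypothetical; they are not part of this model or of the transformers library.

```python
def split_into_chunks(audio, sr=16000, chunk_seconds=30.0):
    """Split a 1-D audio sequence into consecutive, non-overlapping chunks.

    Hypothetical helper: works on any sliceable sequence, including the
    NumPy array returned by librosa.load. The last chunk may be shorter.
    """
    chunk_len = int(sr * chunk_seconds)
    if chunk_len <= 0:
        raise ValueError("sr * chunk_seconds must be at least one sample")
    return [audio[start:start + chunk_len]
            for start in range(0, len(audio), chunk_len)]


# Example with a synthetic 70-second signal of silence at 16 kHz.
dummy_audio = [0.0] * (70 * 16000)
chunks = split_into_chunks(dummy_audio)
print([len(c) / 16000 for c in chunks])  # → [30.0, 30.0, 10.0]
```

Each chunk can then be passed through `processor.feature_extractor` and `model.generate` as in the usage example, and the decoded strings joined in order. Cutting at fixed boundaries may split a word in two; splitting at pauses instead is a common refinement.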

Model Details

  • Base Model: openai/whisper-small
  • Training Data: 21,385 Sanskrit samples from AI4Bharat Rasa
  • Languages: Sanskrit (sa)
  • Task: Automatic Speech Recognition
  • Model Size: 242M parameters
  • Tensor Type: F32 (Safetensors)