---
license: apache-2.0
datasets:
- ivrit-ai/crowd-transcribe-v5
language:
- he
base_model:
- openai/whisper-large-v3-turbo
---

This is ivrit.ai's faster-whisper model, based on the ivrit-ai/whisper-large-v3-turbo Whisper model.

Training data includes 295 hours of volunteer-transcribed speech from the ivrit-ai/crowd-transcribe-v5 dataset, as well as 93 hours of professionally transcribed speech from other sources.

Release date: TBD

# Prerequisites

pip3 install faster_whisper

# Usage

```
import faster_whisper

# Load the CTranslate2 model from the Hugging Face Hub.
model = faster_whisper.WhisperModel('ivrit-ai/whisper-large-v3-turbo-ct2')
# 'media-file' is a placeholder for the path to your audio or video file.
segs, _ = model.transcribe('media-file', language='he')
# segs is a generator; transcription runs as it is consumed.
texts = [s.text for s in segs]
transcribed_text = ' '.join(texts)
print(f'Transcribed text: {transcribed_text}')
```
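
Each segment returned by `transcribe` also carries `start` and `end` times in seconds, so the output can be printed with timestamps instead of being joined into one string. The following is a minimal sketch, not part of the model itself: the helper names `format_timestamp` and `format_segments` are hypothetical, and the stand-in segments (with placeholder text) exist only so the snippet runs without downloading the model.

```python
from types import SimpleNamespace


def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as HH:MM:SS.mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{millis:03d}"


def format_segments(segments) -> list:
    """Turn objects with .start, .end and .text into '[start -> end] text' lines."""
    return [
        f"[{format_timestamp(s.start)} -> {format_timestamp(s.end)}] {s.text.strip()}"
        for s in segments
    ]


# Stand-in segments (hypothetical data) mimicking the attributes used above.
demo = [
    SimpleNamespace(start=0.0, end=2.5, text=" shalom"),
    SimpleNamespace(start=2.5, end=65.25, text=" olam"),
]
for line in format_segments(demo):
    print(line)
```

With the real model, passing the `segs` generator from the usage example to `format_segments` in place of `demo` should give one timestamped line per segment.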