metadata
library_name: transformers
language:
- sq
license: mit
base_model: openai/whisper-large-v3-turbo
datasets:
- Kushtrim/audioshqip-200h
metrics:
- wer
model-index:
- name: Whisper Large v3 Turbo Shqip
  results:
  - task:
      type: automatic-speech-recognition
      name: Automatic Speech Recognition
    dataset:
      name: Audio Shqip 200 orë
      type: Kushtrim/audioshqip-200h
      args: 'config: sq, split: test'
    metrics:
    - type: wer
      value: 19.891368436098556
      name: Wer
Whisper Large V3 Turbo Shqip
This model is a fine-tuned version of openai/whisper-large-v3-turbo for the Albanian language, including the Gheg dialect. It was trained on Kushtrim/audioshqip-200h, a curated dataset of 200 hours of Albanian audio, and scores a word error rate (WER) of 19.89 on that dataset's test split.
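For readers unfamiliar with the WER metric listed in the metadata, the sketch below shows the usual definition: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the reported value is on a 0-100 scale, which this card does not state, and it applies no text normalization, since the normalization used for the reported score is not documented here. The function name and the sample sentences are invented for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER on a 0-100 scale using word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return 100.0 * prev[-1] / len(ref)


# Invented example: one substituted word out of four reference words.
print(word_error_rate("mirëdita si jeni sot", "mirëdita si je sot"))  # 25.0
```

Under that assumption, a WER of 19.89 would correspond to roughly one word-level error per five reference words, averaged over the test split.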
Key Features
- Language Coverage: Covers standard Albanian as well as the Gheg dialect, so transcription is intended to hold up across regional variation.
- Dataset: Fine-tuned on 200 hours of annotated Albanian audio (Kushtrim/audioshqip-200h) spanning a range of accents, speech contexts, and domains.
This model is optimized for automatic speech recognition (ASR) tasks in Albanian and can be used in applications such as transcription, subtitling, and real-time speech processing.
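A minimal usage sketch for the transcription use case, based on the `transformers` automatic-speech-recognition pipeline. This card does not state the model's Hub repository ID, so `MODEL_ID` below is a hypothetical placeholder that must be replaced. The helper names, the 30-second chunk length, and the choice to pin the language to Albanian are illustrative assumptions, not settings documented by the author.

```python
from typing import Any, Dict

# Hypothetical placeholder: replace with this model's actual Hub repository ID.
MODEL_ID = "your-namespace/whisper-large-v3-turbo-shqip"


def generation_settings(language: str = "albanian") -> Dict[str, str]:
    """Whisper generation options that pin the language and the task."""
    return {"language": language, "task": "transcribe"}


def transcribe(audio_path: str, model_id: str = MODEL_ID) -> str:
    """Transcribe one audio file with the transformers ASR pipeline."""
    # Imported lazily so the helper above can be used without loading torch.
    import torch
    from transformers import pipeline

    use_gpu = torch.cuda.is_available()
    asr = pipeline(
        "automatic-speech-recognition",
        model=model_id,
        torch_dtype=torch.float16 if use_gpu else torch.float32,
        device=0 if use_gpu else -1,
        chunk_length_s=30,  # split long recordings into 30-second windows
    )
    result: Dict[str, Any] = asr(audio_path, generate_kwargs=generation_settings())
    return result["text"]
```

Calling `transcribe("sample.wav")` with a real repository ID and a local audio file should return the transcript as a string. Decoding audio files from disk requires `ffmpeg` to be installed. For subtitling, the pipeline call also accepts `return_timestamps=True`, in which case the result additionally contains timestamped chunks.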