# Whisper Large V3 Fine-tuned for Egyptian Arabic

This model is a fine-tuned version of [openai/whisper-large-v3](https://huggingface.co/openai/whisper-large-v3) on the Egyptian-ASR-MGB-3 dataset.

## Model Description
This model has been fine-tuned using LoRA (Low-Rank Adaptation) to improve automatic speech recognition performance on Egyptian Arabic dialect.
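As background, LoRA freezes the pretrained weight matrices and trains only a small low-rank additive update per targeted projection, which keeps the number of trainable parameters tiny relative to the base model. With the rank and scaling values listed under LoRA Configuration below, each adapted projection takes the standard form:

$$W' = W + \frac{\alpha}{r}\, B A, \qquad B \in \mathbb{R}^{d \times r},\ A \in \mathbb{R}^{r \times k},\ r \ll \min(d, k)$$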
## Training Details
- Base Model: openai/whisper-large-v3
- Fine-tuning Method: LoRA (Low-Rank Adaptation)
- Dataset: Egyptian-ASR-MGB-3
- Language: Egyptian Arabic
- Training Steps: 100
- Batch Size: 1 (with gradient accumulation steps: 8, for an effective batch size of 8)
- Learning Rate: 1e-4
### LoRA Configuration
- Rank (r): 8
- Alpha: 32
- Target Modules: ["q_proj", "v_proj"]
- Dropout: 0.1
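The parameters above map directly onto a PEFT `LoraConfig`. A minimal sketch is below; the field names follow standard PEFT usage, and the `bias` setting is an assumption since the card does not specify it:

```python
from peft import LoraConfig

# Sketch of the adapter configuration listed above.
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],  # attention projections that receive adapters
    lora_dropout=0.1,                     # dropout on the LoRA branch during training
    bias="none",                          # assumption: bias terms left untouched
)
```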
## Performance
- Word Error Rate (WER): 0.4739 (self-reported) on the Egyptian-ASR-MGB-3 evaluation set
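For reference, WER is conventionally computed with the `evaluate` library as sketched below; the strings are illustrative placeholders, not samples from the evaluation set:

```python
import evaluate

# Word Error Rate: (substitutions + deletions + insertions) / reference words.
wer_metric = evaluate.load("wer")
wer = wer_metric.compute(
    predictions=["hypothesis transcription from the model"],
    references=["ground truth reference transcription"],
)
print(f"WER: {wer:.4f}")  # the card reports 0.4739 on Egyptian-ASR-MGB-3
```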
## Usage

The snippet below loads the base model, attaches the LoRA adapter, and transcribes a 16 kHz audio file:
```python
import torch
from transformers import WhisperProcessor, AutoModelForSpeechSeq2Seq
from peft import PeftModel
import librosa

device = "cuda" if torch.cuda.is_available() else "cpu"
torch_dtype = torch.float16 if device == "cuda" else torch.float32  # float16 is unsupported on most CPUs

# Load the processor and the base model
processor = WhisperProcessor.from_pretrained("AbdelrahmanHassan/whisper-large-v3-egyptian-arabic")
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-large-v3",
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "AbdelrahmanHassan/whisper-large-v3-egyptian-arabic")
model.to(device)
model.eval()

# Load and preprocess audio (Whisper expects 16 kHz mono input)
audio, sr = librosa.load("path_to_audio.wav", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
input_features = input_features.to(device, dtype=torch_dtype)  # match the model's device and dtype

# Generate the transcription, pinning the decoder to Arabic transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features, language="ar", task="transcribe", max_length=225)
transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]
print(transcription)
```
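For deployment without the PEFT dependency, the adapter weights can optionally be folded into the base model using PEFT's standard `merge_and_unload()`; the output path below is an assumption:

```python
# Merge the LoRA weights into the base model and save a standalone checkpoint.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("./whisper-large-v3-egyptian-arabic-merged")  # assumed path
```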
## Training Procedure

### Training Data
The model was trained on the Egyptian-ASR-MGB-3 dataset, which contains Egyptian Arabic speech samples.
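A sketch of preparing such data with the `datasets` library is below; the Hub repository id and the audio column name are assumptions, since the card does not give the exact id:

```python
from datasets import load_dataset, Audio

# Hypothetical repository id -- substitute the actual id under which
# Egyptian-ASR-MGB-3 is hosted.
dataset = load_dataset("Egyptian-ASR-MGB-3")
# Resample to 16 kHz, the input rate Whisper's feature extractor expects.
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))
```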
### Training Hyperparameters
- Learning Rate: 1e-4
- Training Steps: 100
- Warmup Steps: 5
- Per Device Train Batch Size: 1
- Gradient Accumulation Steps: 8
- Generation Max Length: 225
- FP16/BF16: Automatic detection based on hardware
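For reference, these hyperparameters map onto transformers' `Seq2SeqTrainingArguments` roughly as sketched below; `output_dir` and the precision-detection shortcut are assumptions, all other values come from the list above:

```python
import torch
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="./whisper-large-v3-egyptian-arabic",  # assumed output path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,    # effective batch size of 8
    learning_rate=1e-4,
    warmup_steps=5,
    max_steps=100,
    generation_max_length=225,
    fp16=torch.cuda.is_available(),   # crude stand-in for the card's "automatic detection"
)
```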
## Framework Versions
- Transformers: Latest
- PyTorch: Latest
- PEFT: Latest
- Datasets: Latest
## Citation
If you use this model, please cite:
```bibtex
@misc{whisper-egyptian-arabic,
  title        = {Whisper Large V3 Fine-tuned for Egyptian Arabic},
  author       = {Your Name},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/AbdelrahmanHassan/whisper-large-v3-egyptian-arabic}}
}
```
## Limitations and Bias

This model is fine-tuned specifically for the Egyptian Arabic dialect and may not perform well on other Arabic dialects or languages. Performance also depends on the quality and diversity of the training data.