
Whisper Large V3 Fine-tuned for Egyptian Arabic

This model is a fine-tuned version of openai/whisper-large-v3 on the Egyptian-ASR-MGB-3 dataset.

Model Description

This model has been fine-tuned using LoRA (Low-Rank Adaptation) to improve automatic speech recognition performance on Egyptian Arabic dialect.

Training Details

  • Base Model: openai/whisper-large-v3
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • Dataset: Egyptian-ASR-MGB-3
  • Language: Egyptian Arabic
  • Training Steps: 100
  • Batch Size: 1 (with gradient accumulation steps: 8)
  • Learning Rate: 1e-4

LoRA Configuration

  • Rank (r): 8
  • Alpha: 32
  • Target Modules: ["q_proj", "v_proj"]
  • Dropout: 0.1
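
These values correspond to a standard peft.LoraConfig. The sketch below is illustrative only, since the exact training-time configuration is only partially documented in this card; it shows how an equivalent adapter could be attached to the base model.

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForSpeechSeq2Seq

# Base Whisper model to attach the adapter to
base_model = AutoModelForSpeechSeq2Seq.from_pretrained("openai/whisper-large-v3")

# LoRA configuration mirroring the values listed above
lora_config = LoraConfig(
    r=8,                                  # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the update
    target_modules=["q_proj", "v_proj"],  # attention projections to adapt
    lora_dropout=0.1,                     # dropout on the LoRA activations
)

# Wrap the base model; only the LoRA parameters remain trainable
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()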

Performance

  • Word Error Rate (WER): 0.4739
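
A WER of 0.4739 means the total number of word substitutions, deletions, and insertions amounts to about 47% of the reference word count. The evaluation script is not part of this card; the following is a minimal sketch of computing WER with the evaluate library, with placeholder data.

import evaluate

# WER = (substitutions + deletions + insertions) / reference word count
wer_metric = evaluate.load("wer")

references = ["example reference transcription"]   # ground-truth transcripts
predictions = ["example predicted transcription"]  # model outputs

wer = wer_metric.compute(predictions=predictions, references=references)
print(f"WER: {wer:.4f}")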

Usage

import torch
from transformers import WhisperProcessor, AutoModelForSpeechSeq2Seq
from peft import PeftModel
import librosa

# Load the model and processor
processor = WhisperProcessor.from_pretrained("AbdelrahmanHassan/whisper-large-v3-egyptian-arabic")
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    "openai/whisper-large-v3",
    torch_dtype=torch.float16,
    low_cpu_mem_usage=True,
    use_safetensors=True
)

# Load the LoRA adapter on top of the base model
model = PeftModel.from_pretrained(model, "AbdelrahmanHassan/whisper-large-v3-egyptian-arabic")
model.eval()

# Move the model to a GPU if one is available (the float16 weights are intended for GPU inference)
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

# Load and resample the audio to the 16 kHz rate Whisper expects
audio, sr = librosa.load("path_to_audio.wav", sr=16000)
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
# Match the model's device and dtype to avoid a float16/float32 mismatch
input_features = input_features.to(device, dtype=torch.float16)

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features, max_length=225)
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)[0]

print(transcription)
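
For deployment it can be convenient to fold the adapter into the base weights so inference runs as a plain Whisper model without the PEFT wrapper. This optional step uses PEFT's standard merge utility; the output directory name below is illustrative.

# Merge the LoRA weights into the base model and save a standalone checkpoint
merged_model = model.merge_and_unload()
merged_model.save_pretrained("whisper-large-v3-egyptian-arabic-merged")
processor.save_pretrained("whisper-large-v3-egyptian-arabic-merged")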

Training Procedure

Training Data

The model was trained on the Egyptian-ASR-MGB-3 dataset, which contains Egyptian Arabic speech samples.
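
The exact Hub identifier and column schema of the dataset are not given in this card, so the following preprocessing sketch uses placeholders for the dataset path and the audio/text column names; it shows the usual way of preparing speech data for Whisper fine-tuning.

from datasets import load_dataset, Audio
from transformers import WhisperProcessor

# Placeholder identifier; substitute the actual Hub id or local path of Egyptian-ASR-MGB-3
dataset = load_dataset("path/to/Egyptian-ASR-MGB-3")

processor = WhisperProcessor.from_pretrained("openai/whisper-large-v3")

# Decode audio at the 16 kHz sampling rate Whisper expects
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))

def prepare_example(batch):
    audio = batch["audio"]
    # Log-Mel input features for the encoder
    batch["input_features"] = processor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    # Tokenized target transcription for the decoder (the text column name may differ)
    batch["labels"] = processor.tokenizer(batch["text"]).input_ids
    return batch

dataset = dataset.map(prepare_example, remove_columns=dataset.column_names["train"])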

Training Hyperparameters

  • Learning Rate: 1e-4
  • Training Steps: 100
  • Warmup Steps: 5
  • Per Device Train Batch Size: 1
  • Gradient Accumulation Steps: 8
  • Generation Max Length: 225
  • FP16/BF16: Automatic detection based on hardware
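
The hyperparameters above map directly onto transformers' Seq2SeqTrainingArguments. The sketch below assumes the standard Seq2SeqTrainer setup was used; the original training script is not included in this card, and the output directory name is illustrative.

import torch
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-large-v3-egyptian-arabic",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,   # effective batch size of 8
    learning_rate=1e-4,
    warmup_steps=5,
    max_steps=100,
    generation_max_length=225,
    fp16=torch.cuda.is_available(),  # simplified stand-in for hardware-based precision detection
    predict_with_generate=True,
)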

Framework Versions

  • Transformers: Latest
  • PyTorch: Latest
  • PEFT: Latest
  • Datasets: Latest
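
Exact versions are not pinned in this card. One way to record the environment actually used for inference or further training:

import datasets, peft, torch, transformers

# Print the installed versions so the run can be reproduced later
for lib in (transformers, torch, peft, datasets):
    print(lib.__name__, lib.__version__)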

Citation

If you use this model, please cite:

@misc{whisper-egyptian-arabic,
  title={Whisper Large V3 Fine-tuned for Egyptian Arabic},
  author={Your Name},
  year={2025},
  howpublished={\url{https://huggingface.co/AbdelrahmanHassan/whisper-large-v3-egyptian-arabic}}
}

Limitations and Bias

This model is specifically fine-tuned for Egyptian Arabic dialect and may not perform well on other Arabic dialects or languages. The performance is dependent on the quality and diversity of the training data.
