Install Requirements
Clone the ADI-20 GitHub repository and install its dependencies:
git clone https://github.com/elyadata/ADI-20
cd ADI-20
pip install -r requirements.txt
Note on SpeechBrain
The requirements.txt in the ADI-20 GitHub repository installs the PyPI release of SpeechBrain. Alternatively, you can install SpeechBrain from source (the develop branch) with the following command:
pip install git+https://github.com/speechbrain/speechbrain.git@develop
Perform Arabic Dialect Identification
from inference.classifier_attention_pooling import WhisperDialectClassifier
dialect_id = WhisperDialectClassifier.from_hparams(
    source="Elyadata/ADI-whisper-ADI17",
    hparams_file="hyperparams.yaml",
    savedir="pretrained_DID",
    run_opts={"device": "cuda"},  # GPU recommended; use "cpu" if no GPU is available.
)
out_prob, score, index, text_lab = dialect_id.classify_file("your_file.wav")
print(f"Predicted dialect: {text_lab[0]}")
print("-" * 15)
print(f"Dialect index: {index}")
print(f"Score: {score}")
print(f"Output log probs: {out_prob}")
print("-" * 15)
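The snippet above prints the raw log probabilities. To make them easier to read, they can be turned into a ranked list of dialects. The following is a minimal sketch, not part of the ADI-20 code base: the helper `top_k_dialects`, the label names, and the numeric values are all hypothetical and only serve as an illustration. It assumes one log probability per dialect, given as a plain list of floats.

```python
import math


def top_k_dialects(log_probs, labels, k=3):
    """Return the k most likely (label, probability) pairs.

    log_probs: one log probability per dialect (plain list of floats).
    labels: dialect names in the same order (hypothetical in this sketch).
    """
    if len(log_probs) != len(labels):
        raise ValueError("log_probs and labels must have the same length")
    # Exponentiate and renormalise; subtracting the maximum first keeps
    # the exponentials numerically stable.
    peak = max(log_probs)
    weights = [math.exp(lp - peak) for lp in log_probs]
    total = sum(weights)
    probs = [w / total for w in weights]
    # Sort from most to least likely and keep the first k entries.
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up labels and values, for illustration only.
example_labels = ["Tunisian", "Egyptian", "Levantine", "Moroccan"]
example_log_probs = [math.log(0.6), math.log(0.25), math.log(0.1), math.log(0.05)]

for label, prob in top_k_dialects(example_log_probs, example_labels, k=2):
    print(f"{label}: {prob:.2f}")
```

With the real classifier, `out_prob` is likely a tensor rather than a list, so it would first need converting (for example with something like `out_prob.squeeze().tolist()`, assuming a single input file), and the label list would have to come from the model's own label encoder.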
NADI 2025
We also used the ADI-20 version of this model for the dialect identification task of the NADI 2025 challenge, where it ranked first:
| Rank | Codabench Username | Accuracy | Cost |
|---|---|---|---|
| 🥇 | harounelleuch (this model) | 0.7983 | 0.1788 |
| 🥈 | badr_alabsi | 0.7640 | 0.2265 |
| 🥉 | rafiulbiswas | 0.616 | 0.3068 |
| 4 | gahmed92 | 0.612 | 0.3477 |
| 5 | ADI Baseline | 0.6109 | 0.3422 |
For more information on how we used the model, you can refer to:
- Our system paper: arXiv, ACL Anthology
- NADI findings paper: arXiv, ACL Anthology
Citations
If you use this work, please cite:
@inproceedings{elleuch25_interspeech,
title = {{ADI-20: Arabic Dialect Identification dataset and models}},
author = {Haroun Elleuch and Salima Mdhaffar and Yannick Estève and Fethi Bougares},
year = {2025},
booktitle = {{Interspeech 2025}},
pages = {2775--2779},
doi = {10.21437/Interspeech.2025-884},
issn = {2958-1796},
}
@inproceedings{elleuch-etal-2025-elyadata,
title = "{ELYADATA} {\&} {LIA} at {NADI} 2025: {ASR} and {ADI} Subtasks",
author = "Elleuch, Haroun and
Saidi, Youssef and
Mdhaffar, Salima and
Est{\`e}ve, Yannick and
Bougares, Fethi",
booktitle = "Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks",
month = nov,
year = "2025",
address = "Suzhou, China",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2025.arabicnlp-sharedtasks.105/",
doi = "10.18653/v1/2025.arabicnlp-sharedtasks.105",
pages = "762--766",
ISBN = "979-8-89176-356-2",
}