---
language: yo
license: other
library_name: transformers
pipeline_tag: automatic-speech-recognition
tags:
- automatic-speech-recognition
- yoruba
- whisper-small
- nigeria
base_model: openai/whisper-small
---

# Yoruba ASR - Whisper Small

A state-of-the-art automatic speech recognition (ASR) model for the Yoruba language, fine-tuned from OpenAI's Whisper Small. This model is powered by **Awarri Technologies** and is an initiative of the **Federal Ministry of Communications, Innovation and Digital Economy** to advance indigenous language technologies and promote digital inclusion.

## Model Description

- **Model Name:** Yoruba-ASR-v1.0
- **Architecture:** Whisper Small (244M parameters)
- **Language:** Yoruba (yo)
- **License:** other
- **Model Size:** ~244M parameters

## Training Data

- **Training Time:** 120 hours
- **Data Sources:**
  - Langeasy platform recordings from speakers across Nigeria's 6 geopolitical zones
  - Publicly available datasets
- **Geographic Coverage:** All 6 geopolitical zones of Nigeria

## Quick Start

### Installation

```bash
pip install torch torchaudio transformers librosa
```

### Basic Usage

```python
from transformers import pipeline
import librosa

# Initialize the ASR pipeline
asr = pipeline("automatic-speech-recognition", model="NCAIR1/Yoruba-ASR")

# Load audio file (16kHz recommended)
audio, sr = librosa.load("your_yoruba_audio.wav", sr=16000)

# Transcribe
result = asr(audio)
print(result["text"])
```

### Advanced Usage

```python
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

# Load model and processor
processor = WhisperProcessor.from_pretrained("NCAIR1/Yoruba-ASR")
model = WhisperForConditionalGeneration.from_pretrained("NCAIR1/Yoruba-ASR")

# Process audio
audio, sr = librosa.load("audio_file.wav", sr=16000)
input_features = processor(audio, sampling_rate=sr, return_tensors="pt").input_features

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features)

transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
print(transcription[0])
```
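### Transcribing Longer Audio

Whisper processes at most 30 seconds of audio per inference (see the technical specifications below), so longer recordings need to be split into chunks. The snippet below is a minimal sketch that relies on the Transformers pipeline's built-in chunking; the chunk length, batch size, and the file name `long_yoruba_audio.wav` are illustrative assumptions rather than settings prescribed for this model.

```python
from transformers import pipeline
import librosa

# Chunked long-form transcription (sketch): the pipeline splits the audio
# into overlapping ~30 s windows and stitches the partial transcripts.
asr = pipeline(
    "automatic-speech-recognition",
    model="NCAIR1/Yoruba-ASR",
    chunk_length_s=30,  # matches Whisper's 30 s context window
    batch_size=8,       # illustrative; tune to your hardware
)

# Hypothetical example file; any recording longer than 30 s applies here.
audio, sr = librosa.load("long_yoruba_audio.wav", sr=16000)

result = asr(audio, return_timestamps=True)
print(result["text"])           # full stitched transcript
for chunk in result["chunks"]:  # per-segment timestamps and text
    print(chunk["timestamp"], chunk["text"])
```

A file path can also be passed straight to the pipeline, in which case it decodes the audio itself (this requires `ffmpeg` to be available).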
## Use Cases

### ✅ Recommended Applications

- Academic research on Yoruba linguistics
- Educational tools for Yoruba language learning
- Accessibility applications for hearing-impaired Yoruba speakers
- Cultural preservation and documentation
- Voice-enabled applications for Yoruba speakers
- Media and broadcast transcription
- Government services requiring Yoruba language support

### ❌ Not Recommended

- Mass surveillance or unauthorized monitoring
- High-stakes applications without human oversight (legal, medical)
- Applications that could discriminate based on dialectal variations

## Limitations

- **Dialectal Coverage:** Some regional dialects may have varying accuracy levels
- **Code-Switching:** Reduced performance when mixing Yoruba with English
- **Audio Quality:** Performance degrades with poor audio quality or excessive noise
- **Children's Speech:** Limited training data for younger speakers
- **Domain-Specific Content:** May require fine-tuning for specialized domains

## Model Details

### Technical Specifications

- **Architecture:** Transformer-based (Whisper Small)
- **Parameters:** 244M
- **Input:** Audio waveform (16kHz recommended)
- **Output:** Yoruba text transcription
- **Context Length:** 30 seconds maximum per inference

### Training Details

- **Base Model:** OpenAI Whisper Small
- **Fine-tuning Data:** 627.09 hours of Yoruba speech
- **Geographic Representation:** All 6 geopolitical zones of Nigeria
- **Training Framework:** PyTorch with Hugging Face Transformers

## Fine-tuning

For domain-specific applications, this model can be further fine-tuned:

```python
from transformers import WhisperForConditionalGeneration, Seq2SeqTrainer

# Load the base model
model = WhisperForConditionalGeneration.from_pretrained("NCAIR1/Yoruba-ASR")

# Fine-tune with your domain-specific Yoruba data
# Recommended: 10-20 hours of high-quality domain audio
```
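The stub above only loads the checkpoint. As a fuller illustration, the sketch below follows the common Hugging Face `Seq2SeqTrainer` recipe for Whisper fine-tuning; the dataset location (`my_domain_audio/`), the `transcription` column name, the data collator, and every hyperparameter are illustrative assumptions, not the configuration used to train Yoruba-ASR.

```python
from dataclasses import dataclass
from typing import Any

import torch
from datasets import Audio, load_dataset
from transformers import (
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
    WhisperForConditionalGeneration,
    WhisperProcessor,
)

# Load the processor and the checkpoint to adapt.
processor = WhisperProcessor.from_pretrained("NCAIR1/Yoruba-ASR")
model = WhisperForConditionalGeneration.from_pretrained("NCAIR1/Yoruba-ASR")

# Hypothetical local dataset: an "audiofolder" whose metadata.csv contains a
# "transcription" column with the Yoruba reference text for each clip.
dataset = load_dataset("audiofolder", data_dir="my_domain_audio/", split="train")
dataset = dataset.cast_column("audio", Audio(sampling_rate=16000))

def prepare(batch):
    # Convert the waveform to log-Mel input features and the transcript to label ids.
    audio = batch["audio"]
    batch["input_features"] = processor.feature_extractor(
        audio["array"], sampling_rate=audio["sampling_rate"]
    ).input_features[0]
    batch["labels"] = processor.tokenizer(batch["transcription"]).input_ids
    return batch

dataset = dataset.map(prepare, remove_columns=dataset.column_names)

@dataclass
class DataCollatorSpeechSeq2SeqWithPadding:
    # Pads input features and label ids separately, masking padded labels with -100.
    processor: Any
    decoder_start_token_id: int

    def __call__(self, features):
        input_features = [{"input_features": f["input_features"]} for f in features]
        batch = self.processor.feature_extractor.pad(input_features, return_tensors="pt")

        label_features = [{"input_ids": f["labels"]} for f in features]
        labels_batch = self.processor.tokenizer.pad(label_features, return_tensors="pt")
        labels = labels_batch["input_ids"].masked_fill(
            labels_batch["attention_mask"].ne(1), -100
        )

        # The tokenizer already prepends <|startoftranscript|>; drop it here because
        # the model re-adds it when shifting the labels to build decoder inputs.
        if (labels[:, 0] == self.decoder_start_token_id).all().cpu().item():
            labels = labels[:, 1:]

        batch["labels"] = labels
        return batch

training_args = Seq2SeqTrainingArguments(
    output_dir="./yoruba-asr-domain",  # hypothetical output path
    per_device_train_batch_size=8,     # illustrative hyperparameters throughout
    learning_rate=1e-5,
    warmup_steps=100,
    max_steps=2000,
    fp16=True,                         # assumes a CUDA GPU is available
    logging_steps=50,
    save_steps=500,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=dataset,
    data_collator=DataCollatorSpeechSeq2SeqWithPadding(
        processor=processor,
        decoder_start_token_id=model.config.decoder_start_token_id,
    ),
)

trainer.train()

# Save the adapted model and processor together for later use.
trainer.save_model("./yoruba-asr-domain")
processor.save_pretrained("./yoruba-asr-domain")
```

The saved directory can then be loaded exactly like the base model in the Quick Start examples above.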
## Ethical Considerations

- Designed to promote equitable access to speech technology for Yoruba speakers
- Users should consider cultural sensitivity when deploying the model
- Continuous monitoring for bias and performance variations is recommended
- Should not be used for surveillance or applications that could harm individuals

## Citation

```bibtex
@misc{awarri2025yoruba,
  title={Yoruba-ASR-v1.0: Automatic Speech Recognition for Yoruba Language},
  author={Awarri Technologies and National Information Technology Development Agency},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/NCAIR-NG/Yoruba}
}
```

## Contact & Support

- **Initiative Of:** The Federal Ministry of Communications, Innovation and Digital Economy
- **Powered By:** Awarri Technologies
- **Project:** N-ATLaS
- **Version:** 1.0 (September 2025)

For issues, questions, or collaboration opportunities, please refer to the model repository discussions or contact Awarri Technologies.

## Acknowledgments

This work was made possible through:

- AWARRI Technologies
- National Information Technology Development Agency (NITDA)
- The Federal Ministry of Communications, Innovation and Digital Economy
- National Center for Artificial Intelligence and Robotics
- Data contributors from across Nigeria's 6 geopolitical zones via the Langeasy platform
- The broader Nigerian language technology research community

# Terms of Use for *Yoruba-ASR*

**Effective Date:** September 2025
**Version:** 1.0

---

## 1. Introduction & Scope

Awarri Technologies, in partnership with the Federal Government of Nigeria, hereby releases **Yoruba-ASR**, an Automatic Speech Recognition (ASR) model for the Yoruba language, based on the Whisper Small architecture.

Yoruba-ASR is released under an Open-Source Research and Innovation License inspired by permissive licenses such as Apache 2.0 and MIT, but with additional restrictions tailored for responsible use in Nigeria and globally.

The model is intended to support:

- Research and academic study
- Education and capacity development
- Civic technology and accessibility initiatives
- Linguistic and cultural preservation, and community projects

⚠️ Yoruba-ASR is *not* an enterprise-grade or commercial system. Commercial or large-scale enterprise use requires a separate licensing agreement (see Section 3).

---

## 2. License Grant

Subject to compliance with these Terms, users are granted a worldwide, royalty-free, non-exclusive, non-transferable license to:

- Download, use, and run Yoruba-ASR for permitted purposes
- Modify, adapt, and create derivative works of Yoruba-ASR
- Redistribute Yoruba-ASR and derivative works under these same Terms

**Conditions:**

1. Attribution must be given to:
   > "Awarri Technologies and the Federal Government of Nigeria, developers of N-ATLaS (Yoruba-ASR)."
2. Derivative works must be released under the same license, ensuring consistency and traceability.
3. If Yoruba-ASR or its derivatives are renamed, they must carry the suffix: **"Powered by Awarri."**

---

## 3. User License Cap (1000 Users)

Use of Yoruba-ASR is limited to organizations, institutions, or projects with no more than **1000 active end-users**.

- An *active end-user* is an individual who directly interacts with Yoruba-ASR outputs (e.g., via an app, website, or integrated service) within a rolling 30-day period.
- Organizations exceeding the 1000-user cap must obtain a **commercial license** directly from Awarri Technologies in partnership with the Federal Ministry of Communications, Innovation, and Digital Economy.

---

## 4. Acceptable Use

### ✅ Permitted Use Cases

Permitted uses include (but are not limited to):

- Academic and non-profit research
- Accessibility for persons with disabilities
- Language and cultural preservation projects
- Civic technology and public benefit applications
- Education, training, and community innovation

### ❌ Prohibited Use Cases

Prohibited uses include (but are not limited to):

- Surveillance or unlawful monitoring
- Discriminatory profiling or exclusionary practices
- Disinformation, impersonation, or synthetic fraud
- Military, intelligence, or weaponized deployment
- Exploitative, harmful, or unlawful applications

---

## 5. Limitations & Disclaimer

Yoruba-ASR is released **"as-is"**, without warranties of any kind, express or implied.

**Known limitations include:**

- Dialectal coverage: Some regional dialects of Yoruba may have varying accuracy levels
- Code-switching: Reduced performance when mixing Yoruba with English or other languages
- Audio quality: Performance degrades with poor audio or excessive background noise
- Children's speech: Limited training data for younger speakers

Neither Awarri Technologies nor the Federal Government of Nigeria shall be liable for damages arising from the use of Yoruba-ASR.

---

## 6. Ethical & Cultural Considerations

Users must:

- Respect Nigeria's cultural and linguistic diversity, particularly within Yoruba-speaking communities
- Ensure transparent reporting of accuracy, bias, and limitations
- Uphold human rights and privacy standards in all deployments

---

## 7. Data & Privacy

- All training data used in Yoruba-ASR was either publicly available or government-approved for use.
- Users are strictly prohibited from using Yoruba-ASR for unauthorized personal data scraping, collection, or profiling.

---

## 8. Governance & Updates

- Governance and oversight are led by the Federal Ministry of Communications, Innovation, and Digital Economy, in collaboration with the National Centre for Artificial Intelligence & Robotics (NCAIR).
- Awarri Technologies shall act as the technical maintainer and custodian of Yoruba-ASR.
- Updates, improvements, and community contributions will be published periodically.
- Users must comply with the specific Terms attached to each version release.

---

## 9. Legal & Jurisdiction

- These Terms are governed by the laws of the Federal Republic of Nigeria.
- In the event of a dispute, parties agree to seek resolution first through mediation under the auspices of the Federal Ministry of Justice, before pursuing litigation in Nigerian courts.

---

## 10. Termination

The Federal Government of Nigeria and Awarri Technologies reserve the right to revoke, suspend, or terminate usage rights if these Terms are violated. Termination may apply to individual users, institutions, or organizations found in breach.

---
## 11. Contact & Attribution

For licensing, inquiries, and commercial partnerships regarding Yoruba-ASR, contact:

**Awarri Technologies**

- Email: [datasupport@awarri.com](mailto:datasupport@awarri.com)
- Website: [awarri.com](https://awarri.com)

**Federal Ministry of Communications, Innovation, and Digital Economy**

- Email: [ncair@nitda.gov.ng](mailto:ncair@nitda.gov.ng)
- Website: [NCAIR](https://ncair.nitda.gov.ng/)

**Required attribution in all public use:**

> "Yoruba-ASR is developed by Awarri Technologies in partnership with the Federal Government of Nigeria."

If renamed, the model must carry the suffix:

> **"Powered by Awarri."**

---

*This model contributes to digital inclusion, cultural preservation, and the advancement of indigenous language technologies in Nigeria.*