Nigerian English ASR - Whisper Small

An automatic speech recognition (ASR) model built specifically for Nigerian-accented English, fine-tuned from the Whisper Small architecture. The model is powered by Awarri Technologies and is an initiative of the Federal Ministry of Communications, Innovation and Digital Economy. It aims to bridge the accent gap in speech recognition technology and promote digital inclusion for Nigerian English speakers.

Model Description

  • Model Name: NaijaEnglish-ASR-v1.0
  • Architecture: Whisper Small (244M parameters)
  • Language: Nigerian Accented English (en-NG)
  • License: Other (custom terms; see the Terms of Use below)

Quick Start

Installation

pip install torch torchaudio transformers librosa

Basic Usage

from transformers import pipeline
import librosa

# Initialize the ASR pipeline
asr = pipeline("automatic-speech-recognition", model="NCAIR1/NigerianAccentedEnglish")

# Load audio file (16kHz recommended)
audio, sr = librosa.load("your_nigerian_english_audio.wav", sr=16000)

# Transcribe
result = asr(audio)
print(result["text"])

Advanced Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import torch
import librosa

# Load model and processor
processor = WhisperProcessor.from_pretrained("NCAIR1/NigerianAccentedEnglish")
model = WhisperForConditionalGeneration.from_pretrained("NCAIR1/NigerianAccentedEnglish")

# Process audio
audio, sr = librosa.load("nigerian_english_audio.wav", sr=16000)
input_features = processor(audio, sampling_rate=sr, return_tensors="pt").input_features

# Generate transcription
with torch.no_grad():
    predicted_ids = model.generate(input_features)
    transcription = processor.batch_decode(predicted_ids, skip_special_tokens=True)
    
print(transcription[0])

Use Cases

✅ Perfect For

This model can be applied across multiple domains serving Nigerian speakers, where English usage follows local accents and patterns:

  • Call Centers and Customer Service – Improve support operations by handling local English effectively.
  • Educational Technology – Enable better learning tools tailored for students.
  • Media and Broadcasting – Provide accurate transcription of audio and video content.
  • Government Services – Support public-facing applications that require reliable English language processing.
  • Accessibility Applications – Assist hearing-impaired users through accurate voice-to-text solutions.
  • Academic Research – Facilitate linguistic studies on English usage and variation.
  • Voice-Enabled Applications – Build applications that recognize and respond to local voices and accents.
  • Business Applications – Power enterprise solutions in English-speaking contexts.

❌ Not Recommended

  • Mass surveillance or unauthorized monitoring
  • High-stakes applications without human oversight
  • Applications that could perpetuate accent discrimination
  • Contexts where accent bias could cause harm

Key Features

🎯 Accent-Inclusive Design

  • Trained specifically on Nigerian English speech patterns
  • Recognizes regional variations across Nigeria's 6 geopolitical zones
  • Handles Nigerian expressions and linguistic patterns

🌍 Cultural Awareness

  • Understands Nigerian English conventions
  • Respects linguistic diversity within Nigerian English
  • Promotes inclusive speech recognition technology

⚡ High Performance

  • Significant improvement over general English models
  • Optimized for real-world Nigerian speech patterns
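
This card does not publish benchmark figures, so the improvement claimed above is best checked on your own recordings. A common metric for that is word error rate (WER). The sketch below is a minimal, dependency-free illustration; the function name `word_error_rate` and the lowercase, whitespace-based normalisation are assumptions made for this example, not part of the model or its evaluation setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: text is normalised only by lowercasing and splitting on
    whitespace. Real evaluations usually also strip punctuation.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Running the same held-out clips through this model and through a general English model, then comparing the two WER values against human transcripts, gives a like-for-like comparison for your own domain.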

Limitations

  • Regional Variations: Accuracy may vary across specific regional accents
  • Code-Switching: Reduced performance when speakers mix English with local Nigerian languages
  • Audio Quality: Performance depends on clear audio input
  • Domain-Specific Content: May require fine-tuning for specialized fields
  • Non-Nigerian Accents: Optimized specifically for Nigerian English; accuracy on other accents may be lower

Model Details

Technical Specifications

  • Architecture: Transformer-based (Whisper Small)
  • Parameters: 244M
  • Input: Audio waveform (16kHz recommended)
  • Output: Nigerian English text transcription
  • Context Length: 30 seconds maximum per inference
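
Because each inference covers at most 30 seconds of audio, longer recordings need to be split before transcription. The following is a minimal sketch of how the sample ranges could be computed; the helper name `chunk_bounds` and the optional overlap parameter are assumptions introduced for illustration, not part of the model's API.

```python
def chunk_bounds(num_samples, sample_rate=16000, chunk_s=30.0, overlap_s=0.0):
    """Return (start, end) sample index pairs that cover num_samples.

    Each pair spans at most chunk_s seconds. Consecutive chunks share
    overlap_s seconds of audio, which can help with words cut at a boundary.
    """
    chunk = int(chunk_s * sample_rate)
    step = chunk - int(overlap_s * sample_rate)
    if chunk <= 0 or step <= 0:
        raise ValueError("chunk_s must be positive and larger than overlap_s")

    bounds = []
    start = 0
    while start < num_samples:
        end = min(start + chunk, num_samples)
        bounds.append((start, end))
        if end == num_samples:
            break  # the final chunk reached the end of the audio
        start += step
    return bounds
```

Each `audio[start:end]` slice can then be transcribed separately and the texts joined in order. The Transformers ASR pipeline also offers a `chunk_length_s` argument that handles long-form audio internally, which may be the simpler option when using the pipeline interface.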

Training Details

  • Base Model: OpenAI Whisper Small
  • Training Duration: 120 hours
  • Data Collection Platform: Langeasy
  • Data Sources:
    • Langeasy platform recordings from speakers across Nigeria's 6 geopolitical zones
    • Publicly available Nigerian English datasets
  • Geographic Coverage: All 6 geopolitical zones of Nigeria
  • Accent Diversity: Multiple regional Nigerian English variations

Fine-tuning for Your Domain

Enhance performance for specific Nigerian English applications:

from transformers import WhisperForConditionalGeneration, Seq2SeqTrainer

# Load the Nigerian English base model
model = WhisperForConditionalGeneration.from_pretrained("NCAIR1/NigerianAccentedEnglish")

# Fine-tune with your domain-specific Nigerian English data
# Recommended: 10-20 hours of high-quality domain audio
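
Before fine-tuning, it can help to check how much audio you actually have against the 10-20 hour recommendation. The sketch below uses only the Python standard library and assumes your clips are uncompressed WAV files in a single folder; the function name `total_audio_hours` and the folder layout are hypothetical.

```python
import wave
from pathlib import Path


def total_audio_hours(folder):
    """Sum the duration of every .wav file directly inside folder, in hours."""
    total_seconds = 0.0
    for path in sorted(Path(folder).glob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            # Duration = number of frames / frames per second
            total_seconds += wav.getnframes() / wav.getframerate()
    return total_seconds / 3600.0
```

For example, `total_audio_hours("my_domain_clips")` returns the total length of the clips in that (hypothetical) folder, which you can compare against the recommended range.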

Impact & Applications

This model addresses a critical gap in speech recognition technology by providing:

  • Digital Inclusion for Nigerian English speakers, in a country of 200+ million people
  • Bias Reduction in voice-enabled applications
  • Cultural Preservation of Nigerian English linguistic patterns
  • Economic Opportunities through accessible speech technology

Ethical Considerations

  • Designed to combat accent bias in AI systems
  • Promotes equitable access to speech technology
  • Respects Nigerian English as a legitimate language variety
  • Should not be used for surveillance or discriminatory purposes

Citation

@misc{awarri2025nigerian,
  title={NaijaEnglish-ASR-v1.0: Accent-Inclusive Speech Recognition for Nigerian English},
  author={Awarri Technologies},
  year={2025},
  howpublished={Hugging Face Model Hub},
  url={https://huggingface.co/NCAIR-NG/NigerianAccentedEnglish}
}

Contact & Support

  • Initiative Of: The Federal Ministry of Communications, Innovation and Digital Economy
  • Powered By: Awarri Technologies
  • Project: N-ATLaS
  • Version: 1.0 (September 2025)

For issues, questions, or collaboration opportunities, please refer to the model repository discussions or contact Awarri Technologies.

Acknowledgments

This work was made possible through:

  • Awarri Technologies
  • National Information Technology Development Agency (NITDA)
  • The Federal Ministry of Communications, Innovation and Digital Economy
  • National Centre for Artificial Intelligence and Robotics (NCAIR)
  • Data contributors from across Nigeria's 6 geopolitical zones via the Langeasy platform
  • The broader Nigerian language technology research community

Breaking barriers in speech recognition. Celebrating Nigerian English. Advancing digital inclusion.

Terms of Use for NigerianAccentedEnglish

Effective Date: September 2025
Version: 1.0


1. Introduction & Scope

Awarri Technologies, in partnership with the Federal Government of Nigeria, hereby releases NigerianAccentedEnglish, an Automatic Speech Recognition (ASR) model for Nigerian-accented English.

NigerianAccentedEnglish is released under an Open-Source Research and Innovation License inspired by permissive licenses such as Apache 2.0 and MIT, but with additional restrictions tailored for responsible use in Nigeria and globally.

The model is intended to support:

  • Research and academic study
  • Education and capacity development
  • Civic technology and accessibility initiatives
  • Linguistic and cultural preservation, and community projects

⚠️ NigerianAccentedEnglish is not an enterprise-grade or commercial system. Commercial or large-scale enterprise use requires a separate licensing agreement (see Section 3).


2. License Grant

Subject to compliance with these Terms, users are granted a worldwide, royalty-free, non-exclusive, non-transferable license to:

  • Download, use, and run NigerianAccentedEnglish for permitted purposes
  • Modify, adapt, and create derivative works of NigerianAccentedEnglish
  • Redistribute NigerianAccentedEnglish and derivative works under these same Terms

Conditions:

  1. Attribution must be given to:

    “Awarri Technologies and the Federal Government of Nigeria, developers of N-ATLaS (NigerianAccentedEnglish).”

  2. Derivative works must be released under the same license, ensuring consistency and traceability.

  3. If NigerianAccentedEnglish or its derivatives are renamed, they must carry the suffix: “Powered by Awarri.”


3. User License Cap (1000 Users)

Use of NigerianAccentedEnglish is limited to organizations, institutions, or projects with no more than 1000 active end-users.

  • An active end-user is an individual who directly interacts with the model outputs (e.g., via an app, website, or integrated service) within a rolling 30-day period.
  • Organizations exceeding the 1000-user cap must obtain a commercial license directly from Awarri Technologies in partnership with the Federal Ministry of Communications, Innovation, and Digital Economy.
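
As an illustration of the definition above, the sketch below counts distinct users who interacted with the model outputs in the 30 days before a given moment. It is a minimal example under stated assumptions, not a compliance tool: the event format (pairs of user ID and timestamp), the function names, and the exact handling of the window boundary are all hypothetical choices made for this example.

```python
from datetime import datetime, timedelta

USER_CAP = 1000               # cap stated in these Terms
WINDOW = timedelta(days=30)   # rolling window stated in these Terms


def active_end_users(events, now):
    """Count distinct user IDs with at least one interaction in the window.

    events: iterable of (user_id, timestamp) pairs.
    Assumption: the window is (now - 30 days, now], i.e. the cutoff
    instant itself is excluded.
    """
    cutoff = now - WINDOW
    return len({user_id for user_id, ts in events if cutoff < ts <= now})


def within_user_cap(events, now, cap=USER_CAP):
    """True when the number of active end-users does not exceed the cap."""
    return active_end_users(events, now) <= cap
```

Repeated interactions by the same person count once, and users whose last interaction is older than 30 days drop out of the total.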

4. Acceptable Use

✅ Permitted Use Cases include (but are not limited to):

  • Academic and non-profit research
  • Accessibility for persons with disabilities
  • Language and cultural preservation projects
  • Civic technology and public benefit applications
  • Education, training, and community innovation

❌ Prohibited Use Cases include (but are not limited to):

  • Surveillance or unlawful monitoring
  • Discriminatory profiling or exclusionary practices
  • Disinformation, impersonation, or synthetic fraud
  • Military, intelligence, or weaponized deployment
  • Exploitative, harmful, or unlawful applications

5. Limitations & Disclaimer

  • NigerianAccentedEnglish is released “as-is”, without warranties of any kind, express or implied.

Known limitations include:

  • Dialectal/spoken accent variation may affect performance
  • Reduced accuracy with children’s speech
  • Limited handling of code-switching or mixing English with local languages
  • Degraded performance in very noisy or low-quality audio environments

Neither Awarri Technologies nor the Federal Government of Nigeria shall be liable for damages arising from the use of NigerianAccentedEnglish.


6. Ethical & Cultural Considerations

Users must:

  • Respect Nigeria’s cultural and linguistic diversity
  • Ensure transparent reporting of accuracy, bias, and limitations
  • Uphold human rights and privacy standards in all deployments

7. Data & Privacy

  • All training data used in NigerianAccentedEnglish was either publicly available or government-approved for use.
  • Users are strictly prohibited from using the model for unauthorized personal data scraping, collection, or profiling.

8. Governance & Updates

  • Governance and oversight are led by the Federal Ministry of Communications, Innovation, and Digital Economy, in collaboration with the National Centre for Artificial Intelligence & Robotics (NCAIR).
  • Awarri Technologies shall act as the technical maintainer and custodian of NigerianAccentedEnglish.
  • Updates, improvements, and community contributions will be published periodically.
  • Users must comply with the specific Terms attached to each version release.

9. Legal & Jurisdiction

  • These Terms are governed by the laws of the Federal Republic of Nigeria.
  • In the event of a dispute, parties agree to seek resolution first through mediation under the auspices of the Federal Ministry of Justice, before pursuing litigation in Nigerian courts.

10. Termination

The Federal Government of Nigeria and Awarri Technologies reserve the right to revoke, suspend, or terminate usage rights if these Terms are violated.

Termination may apply to individual users, institutions, or organizations found in breach.


11. Contact & Attribution

For licensing, inquiries, and commercial partnerships regarding NigerianAccentedEnglish, contact:

Awarri Technologies

Federal Ministry of Communications, Innovation, and Digital Economy

Required attribution in all public use:

“NigerianAccentedEnglish is powered by Awarri Technologies and an initiative of the Federal Ministry of Communications, Innovation and Digital Economy.”

If renamed, the model must carry the suffix:

“Powered by Awarri.”


Keywords: Nigerian English, Accent Recognition, Speech-to-Text, West African English, Inclusive AI, Digital Inclusion
