African Kikuyu Text-to-Speech Model

Model Description

This is a Text-to-Speech (TTS) model for the Kikuyu language, one of the major languages spoken in Kenya. The model is based on Facebook's Massively Multilingual Speech (MMS) project and has been redeployed here for better discoverability and accessibility.

Language Information

Language: Kikuyu (Gĩkũyũ)
Language Code: ki (ISO 639-1), kik (ISO 639-3)
Family: Niger-Congo, Atlantic-Congo, Volta-Congo, Benue-Congo, Bantoid, Southern, Narrow Bantu, Central, E, Kikuyu-Kamba (Guthrie code E.51)
Speakers: Approximately 8.1 million native speakers
Region: Central Kenya

Model Details

Model Type: VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)
Base Model: facebook/mms-tts-kik
Training Data: Part of Facebook's MMS dataset
Supported Task: Text-to-Speech synthesis for the Kikuyu language

Usage

Using Transformers

from transformers import VitsModel, VitsTokenizer
import torch
import scipy.io.wavfile

# Load model and tokenizer
model = VitsModel.from_pretrained("BrianMwangi/African-Kikuyu-TTS")
tokenizer = VitsTokenizer.from_pretrained("BrianMwangi/African-Kikuyu-TTS")

# Example text in Kikuyu
text = "Mũthenya ũmwe, njũgũ ya ita yakoragwo na atumia a njũri cia kĩrĩra"

# Generate speech
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Save audio file
audio = outputs.waveform.squeeze().cpu().numpy()
scipy.io.wavfile.write("kikuyu_speech.wav", rate=model.config.sampling_rate, data=audio)
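The snippet above writes the waveform as 32-bit floating-point samples, which not every audio player or tool handles. A minimal sketch of an alternative, assuming a mono 16-bit PCM file is wanted: the helpers `float_to_pcm16` and `write_pcm16_wav` below are hypothetical names introduced for illustration (they are not part of transformers or scipy) and rely only on the Python standard library.

```python
import array
import sys
import wave


def float_to_pcm16(samples):
    """Convert floats in [-1.0, 1.0] to signed 16-bit integers, clipping out-of-range values."""
    pcm = array.array("h")
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # clip to the valid range
        pcm.append(int(round(clipped * 32767)))
    return pcm


def write_pcm16_wav(path, samples, sampling_rate):
    """Write a mono 16-bit PCM WAV file."""
    pcm = float_to_pcm16(samples)
    if sys.byteorder == "big":
        pcm.byteswap()  # WAV data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
```

With the variables from the snippet above, this could be called as `write_pcm16_wav("kikuyu_speech.wav", audio, model.config.sampling_rate)`.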

Using Pipeline

from transformers import pipeline

# Create TTS pipeline
tts = pipeline("text-to-speech", model="BrianMwangi/African-Kikuyu-TTS")

# Generate speech
speech = tts("Mũthenya ũmwe, njũgũ ya ita yakoragwo na atumia a njũri cia kĩrĩra")
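To my knowledge, the text-to-speech pipeline returns a dict with an "audio" entry (the waveform) and a "sampling_rate" entry. The sketch below, under that assumption, flattens the waveform and reports its length; `flatten_audio` and `summarize_speech` are hypothetical helper names, not part of the transformers API.

```python
def flatten_audio(audio):
    """Return the samples as a flat list of floats.

    Accepts a NumPy array (anything with .tolist()) or nested lists,
    for example of shape (1, N) or (N,).
    """
    if hasattr(audio, "tolist"):
        audio = audio.tolist()
    if isinstance(audio, (int, float)):
        return [float(audio)]
    flat = []
    for item in audio:
        flat.extend(flatten_audio(item))
    return flat


def summarize_speech(speech):
    """Summarize a dict that has "audio" and "sampling_rate" keys."""
    samples = flatten_audio(speech["audio"])
    rate = speech["sampling_rate"]
    return {
        "num_samples": len(samples),
        "sampling_rate": rate,
        "duration_seconds": len(samples) / rate,
    }
```

Calling `summarize_speech(speech)` on the pipeline result gives a quick sanity check of the clip length before it is saved or played.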

Examples

Here are some example phrases in Kikuyu:

"Wĩ mwega" - "You are good"
"Nĩ kĩĩ gĩtũmi?" - "What is the reason?"
"Ndĩ na gĩkeno nĩ ũndũ waku" - "I am happy because of you"

Limitations

Output quality is bounded by the training data of the original MMS model.
The model may mispronounce words or phrases that were not represented in the training data.
Performance may vary across dialects of Kikuyu.
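One practical way to anticipate some of these failures is to check the input for characters the tokenizer does not know, since digits, punctuation, or unusual Unicode forms of ĩ and ũ may not be handled as expected. The sketch below is an illustration under stated assumptions: `find_unsupported_chars` is a hypothetical helper, and the NFC normalization and lowercasing steps are assumptions about sensible preprocessing, not a description of what the tokenizer does internally.

```python
import unicodedata


def find_unsupported_chars(text, vocab):
    """Return the sorted unique characters of `text` that are missing from `vocab`.

    The text is NFC-normalized and lowercased first (an assumption, see above);
    whitespace is ignored.
    """
    normalized = unicodedata.normalize("NFC", text).lower()
    return sorted({ch for ch in normalized if not ch.isspace() and ch not in vocab})


# Hypothetical toy vocabulary, for illustration only. In practice, pass
# set(tokenizer.get_vocab()) from the tokenizer loaded in the Usage section.
toy_vocab = set("abcdeghijkmnorstuwy") | {"\u0129", "\u0169"}  # adds ĩ and ũ

print(find_unsupported_chars("Wĩ mwega, 2024!", toy_vocab))
```

Any characters this reports can then be spelled out or removed before synthesis, for example by writing numbers as words.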

Ethical Considerations

This model is intended to:

Promote and preserve the Kikuyu language in digital spaces
Support language learning and accessibility
Enable TTS applications for Kikuyu speakers

Citation

If you use this model, please cite both this repository and the original MMS paper:

@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Pratap, Vineel and Tjandra, Andros and Shi, Bowen and Tomasello, Paden and Babu, Arun and Kundu, Sayani and Elkahky, Ali and Ni, Zhaoheng and Vyas, Apoorv and Conneau, Alexis and others},
  journal={arXiv preprint arXiv:2305.13516},
  year={2023}
}

Acknowledgments

Facebook AI Research: For developing the MMS models
Original Model: facebook/mms-tts-kik
Community: Kikuyu language speakers and African NLP community

Contact

For questions or issues related to this model deployment, please open an issue in this repository.

This model is a redistribution of Facebook's MMS Kikuyu TTS model, repackaged for better discoverability and ease of use.
