African Kikuyu Text-to-Speech Model

Model Description

This is a Text-to-Speech (TTS) model for Kikuyu, one of the major languages spoken in Kenya. It is a copy of the Kikuyu checkpoint from Facebook's Massively Multilingual Speech (MMS) project, republished here for better discoverability and accessibility.

Language Information

  • Language: Kikuyu (Gĩkũyũ)
  • Language Code: ki (ISO 639-1), kik (ISO 639-3)
  • Family: Niger-Congo, Atlantic-Congo, Volta-Congo, Benue-Congo, Bantoid, Southern, Narrow Bantu, Central, E, Kikuyu-Kamba (E.20)
  • Speakers: Approximately 8.1 million native speakers
  • Region: Central Kenya

Model Details

  • Model Type: VITS (Variational Inference with adversarial learning for end-to-end Text-to-Speech)
  • Base Model: facebook/mms-tts-kik
  • Training Data: Part of Facebook's MMS dataset
  • Supported Task: Text-to-Speech synthesis for Kikuyu language

Usage

Using Transformers

from transformers import VitsModel, VitsTokenizer
import torch
import scipy.io.wavfile

# Load model and tokenizer
model = VitsModel.from_pretrained("BrianMwangi/African-Kikuyu-TTS")
tokenizer = VitsTokenizer.from_pretrained("BrianMwangi/African-Kikuyu-TTS")

# Example text in Kikuyu
text = "Mũthenya ũmwe, njũgũ ya ita yakoragwo na atumia a njũri cia kĩrĩra"

# Generate speech
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Save audio file
audio = outputs.waveform.squeeze().cpu().numpy()
scipy.io.wavfile.write("kikuyu_speech.wav", rate=model.config.sampling_rate, data=audio)
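
The waveform saved above is float32, so the resulting file is an IEEE-float WAV, which some players and audio tools may not accept. The following is a minimal sketch of writing 16-bit PCM instead, using only the Python standard library. The helper name `write_pcm16_wav` is hypothetical (it is not part of Transformers or MMS), and the sketch assumes mono audio with samples roughly in the range [-1.0, 1.0].

```python
import array
import sys
import wave


def write_pcm16_wav(path, samples, sampling_rate):
    """Write float samples as a mono 16-bit PCM WAV file.

    `samples` is any 1-D iterable of floats (a list or a 1-D NumPy array).
    Values outside [-1.0, 1.0] are clipped. Returns the number of frames written.
    """
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))
        pcm.append(int(round(clipped * 32767)))

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        pcm.byteswap()

    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)            # mono
        wav_file.setsampwidth(2)            # 2 bytes = 16 bits
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
    return len(pcm)
```

With the variables from the example above, a call could look like `write_pcm16_wav("kikuyu_speech_pcm16.wav", audio, model.config.sampling_rate)`.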

Using Pipeline

from transformers import pipeline

# Create TTS pipeline
tts = pipeline("text-to-speech", model="BrianMwangi/African-Kikuyu-TTS")

# Generate speech; the result is a dict with "audio" (NumPy array) and "sampling_rate" keys
speech = tts("Mũthenya ũmwe, njũgũ ya ita yakoragwo na atumia a njũri cia kĩrĩra")
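
The pipeline call returns a dictionary rather than writing a file. The sketch below assumes the usual Transformers text-to-speech output, with an `audio` array and a `sampling_rate` integer, and shows one way to inspect and save it. The helper names `summarize_speech` and `save_speech` are hypothetical, and the audio array may carry a leading axis (for example shape `(1, N)`), so it is flattened before use.

```python
def summarize_speech(speech):
    """Return (num_samples, sampling_rate, duration_seconds) for a pipeline result.

    `speech` is expected to be a dict with "audio" and "sampling_rate" keys.
    """
    audio = speech["audio"]
    # Flatten array-like audio so a (1, N) array is treated as N samples.
    if hasattr(audio, "reshape"):
        audio = audio.reshape(-1)
    num_samples = len(audio)
    rate = speech["sampling_rate"]
    return num_samples, rate, num_samples / rate


def save_speech(speech, path):
    """Write a pipeline result to a WAV file (requires NumPy and SciPy)."""
    import numpy as np
    import scipy.io.wavfile

    audio = np.asarray(speech["audio"]).reshape(-1)
    scipy.io.wavfile.write(path, rate=speech["sampling_rate"], data=audio)
```

With the `speech` variable from the example above, `save_speech(speech, "kikuyu_speech.wav")` would write the audio to disk, and `summarize_speech(speech)` would report its length in samples and seconds.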

Examples

Here are some example phrases in Kikuyu:

  • "Wĩ mwega" - "You are good"
  • "Nĩ kĩĩ gĩtũmi?" - "What is the reason?"
  • "Ndĩ na gĩkeno nĩ ũndũ waku" - "I am happy because of you"

Limitations

  • Output quality is bounded by the training data of the original MMS model
  • The model may mispronounce words or phrases that are rare or absent in the training data
  • Quality may vary across dialects of Kikuyu
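
One practical check related to these limitations: MMS tokenizers work at the character level, and characters outside the tokenizer's vocabulary (digits, some punctuation, foreign letters) are likely to be dropped before synthesis. The sketch below lists such characters for a given input. The helper name `find_unsupported_chars` is hypothetical, and the lowercasing step is an assumption about how the tokenizer normalizes text.

```python
def find_unsupported_chars(text, vocab):
    """Return the distinct characters of `text` that are not in `vocab`.

    `vocab` is a set of single-character tokens. The text is lowercased
    first (an assumption about the tokenizer's normalization). Characters
    are returned in order of first appearance.
    """
    unsupported = []
    for ch in text.lower():
        if ch not in vocab and ch not in unsupported:
            unsupported.append(ch)
    return unsupported


# Example use with a loaded tokenizer (not executed here):
#   vocab = set(tokenizer.get_vocab())
#   print(find_unsupported_chars("Wĩ mwega 123", vocab))
```

If the returned list is not empty, spelling out numbers or removing the listed characters before calling the model may give more predictable output.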

Ethical Considerations

This model is intended to:

  • Promote and preserve the Kikuyu language in digital spaces
  • Support language learning and accessibility
  • Enable TTS applications for Kikuyu speakers

Citation

If you use this model, please cite both this repository and the original MMS paper:

@article{pratap2023mms,
  title={Scaling Speech Technology to 1,000+ Languages},
  author={Pratap, Vineel and Tjandra, Andros and Shi, Bowen and Tomasello, Paden and Babu, Arun and Kundu, Sayani and Elkahky, Ali and Ni, Zhaoheng and Vyas, Apoorv and Conneau, Alexis and others},
  journal={arXiv preprint arXiv:2305.13516},
  year={2023}
}

Acknowledgments

  • Facebook AI Research: For developing the MMS models
  • Original Model: facebook/mms-tts-kik
  • Community: Kikuyu language speakers and African NLP community

Contact

For questions or issues related to this model deployment, please open an issue in this repository.


This model is a redistribution of Facebook's MMS Kikuyu TTS model, repackaged for better discoverability and ease of use.
