---
language:
- rw
license: cc-by-4.0
library_name: nemo
datasets:
- DigitalUmuganda/Afrivoice_Kinyarwanda
thumbnail: null
tags:
- automatic-speech-recognition
- speech
- audio
- CTC
- Conformer
- NeMo
- pytorch

---


## Model Overview

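This is a Conformer-CTC automatic speech recognition (ASR) model for Kinyarwanda, trained with NVIDIA NeMo on the DigitalUmuganda/Afrivoice_Kinyarwanda dataset. It transcribes Kinyarwanda speech to text.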

## Dependencies

To train, fine-tune, or experiment with the model, you need to install [NVIDIA NeMo](https://github.com/NVIDIA/NeMo).

For inference, installing the toolkit is enough:
```shell
pip install "nemo_toolkit[all]"
```

## How to Use this Model

The model is available for use in the NeMo toolkit, and can be used as a pre-trained checkpoint for inference or for fine-tuning on another dataset.

### Load the model weights

```python
import nemo.collections.asr as nemo_asr
asr_model = nemo_asr.models.ASRModel.from_pretrained("DigitalUmuganda/Mbaza-ASR-Afrivoice-660h")
```

### Transcribing using Python

```python
# Uses the asr_model loaded above; replace '<audio_sample>' with the path to an audio file
asr_model.transcribe(['<audio_sample>'])
```

### Transcribing many audio files

```shell
python [NEMO_GIT_FOLDER]/examples/asr/transcribe_speech.py pretrained_name="DigitalUmuganda/Mbaza-ASR-Afrivoice-660h" audio_dir="<DIRECTORY CONTAINING AUDIO FILES>"
```
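
If you would rather stay in Python than call the script, the sketch below shows one way to transcribe a whole folder. The helper name `transcribe_directory` is hypothetical and not part of NeMo; it only assumes that the model object exposes `transcribe(list_of_paths, batch_size=...)`, and that each result is either a plain string or an object with a `.text` attribute, which can vary between NeMo versions.

```python
from pathlib import Path


def transcribe_directory(model, audio_dir, batch_size=4):
    """Transcribe every .wav file in audio_dir and return {filename: text}.

    `model` is assumed to expose transcribe(list_of_paths, batch_size=...),
    as NeMo ASR models do. This helper is illustrative, not a NeMo API.
    """
    wav_paths = sorted(Path(audio_dir).glob("*.wav"))
    if not wav_paths:
        return {}

    outputs = model.transcribe([str(p) for p in wav_paths], batch_size=batch_size)

    # Depending on the NeMo version, each output may be a plain string or a
    # hypothesis object carrying the transcript in `.text`; handle both.
    texts = [getattr(out, "text", out) for out in outputs]
    return {p.name: text for p, text in zip(wav_paths, texts)}
```

With the model loaded as shown earlier, `transcribe_directory(asr_model, "<DIRECTORY CONTAINING AUDIO FILES>")` would return a dictionary mapping each file name to its transcript.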

### Input

This model accepts 16 kHz (16000 Hz) mono-channel audio (WAV files) as input.
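
Before transcribing, it can help to confirm that a file matches this format. The following is a minimal sketch using only Python's standard `wave` module, which reads uncompressed PCM WAV files; the function name `check_wav_format` and the two constants are illustrative and not part of NeMo.

```python
import wave

EXPECTED_SAMPLE_RATE = 16000  # Hz, the input rate stated above
EXPECTED_CHANNELS = 1         # mono


def check_wav_format(path):
    """Return (is_ok, sample_rate, channels) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        channels = wav_file.getnchannels()
    is_ok = sample_rate == EXPECTED_SAMPLE_RATE and channels == EXPECTED_CHANNELS
    return is_ok, sample_rate, channels
```

Files that fail the check need to be resampled and downmixed first, for example with an external tool such as `ffmpeg -i input.wav -ar 16000 -ac 1 output.wav`.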