title: CTC-Baseline XLSR-based ASR model - set 2-extra
language: multilingual
tags:
  - asr
  - ctc-dro
  - XLSR
license: cc-by-nc-4.0
CTC-Baseline XLSR-based ASR model - set 2-extra
This repository contains a CTC-Baseline XLSR-based automatic speech recognition (ASR) model trained with ESPnet.
The model was trained on unbalanced training data from set 2-extra.
Intended Use
This model is intended for ASR. Users can run inference using the provided checkpoint (valid.loss.best.pth) and configuration file (config.yaml):
import soundfile as sf
from espnet2.bin.asr_inference import Speech2Text
asr_train_config = "ctc-baseline_xlsr_set_2-extra/config.yaml"
asr_model_file = "ctc-baseline_xlsr_set_2-extra/valid.loss.best.pth"
model = Speech2Text.from_pretrained(
    asr_train_config=asr_train_config,
    asr_model_file=asr_model_file
)
speech, _ = sf.read("input.wav")
text, *_ = model(speech)[0]
print("Recognized text:", text)
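The snippet above discards the sample rate returned by sf.read. XLSR-style encoders are generally pretrained on 16 kHz audio, so it may be worth checking an input file's header before running inference. The following is a minimal sketch using only the standard library; the helper name describe_wav_problems and the 16 kHz mono defaults are assumptions for illustration, not values taken from this repository's config.yaml.

```python
import wave


def describe_wav_problems(path, expected_rate=16000, expected_channels=1):
    """Return a list of human-readable mismatches for a PCM WAV file.

    The defaults (16 kHz, mono) are assumptions; confirm the expected
    sample rate against config.yaml before relying on them.
    """
    problems = []
    # Read only the header fields; no audio samples are loaded.
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
    if rate != expected_rate:
        problems.append(f"sample rate is {rate} Hz, expected {expected_rate} Hz")
    if channels != expected_channels:
        problems.append(f"{channels} channels, expected {expected_channels}")
    return problems
```

An empty list means the header matches the expected format. Otherwise the audio likely needs resampling or downmixing before it is passed to the model.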
How to Use
- Clone this repository.
- Use ESPnet’s inference scripts with the provided config.yaml and checkpoint file.
- Ensure any external resources referenced in config.yaml are available at the indicated relative paths.
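The last step can be checked before loading the model. The sketch below scans config.yaml line by line for values that look like file paths and reports those that do not exist. It is a heuristic only: the helper name find_missing_resources, the list of file suffixes, and the choice of base directory are all assumptions, and the actual keys in this repository's config.yaml may differ.

```python
import re
from pathlib import Path

# Heuristic: a "key: value" line whose value ends in a common resource suffix.
# The suffix list is an assumption, not derived from this repository.
PATH_VALUE = re.compile(
    r":\s*['\"]?([^\s'\"#]+\.(?:pth|pt|yaml|yml|txt|model|npz|json))['\"]?\s*$"
)


def find_missing_resources(config_path, base_dir="."):
    """List path-like config values that do not exist under base_dir."""
    missing = []
    for line in Path(config_path).read_text(encoding="utf-8").splitlines():
        match = PATH_VALUE.search(line)
        if not match:
            continue
        # Joining with an absolute path yields that absolute path unchanged.
        resolved = Path(base_dir) / match.group(1)
        if not resolved.exists():
            missing.append(match.group(1))
    return missing
```

Calling find_missing_resources("ctc-baseline_xlsr_set_2-extra/config.yaml", base_dir=".") from the directory where inference will run returns the referenced paths that still need to be put in place. An empty list means every detected path resolved.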
