Model Card for Malay-English Fine-Tuned ASR Model

This model was fine-tuned for 10 epochs on approximately 50 hours of manually curated Malay-English code-switched audio. It achieves a word error rate (WER) of 17.26% on a held-out evaluation set, compared with 34.29% for the base model.
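For readers unfamiliar with the metric, WER is the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the model output, divided by the number of reference words. The sketch below is illustrative only and is not the evaluation script used for this model; the function name `word_error_rate` and the sample sentences are assumptions made up for the example. It also works out the relative improvement implied by the two reported figures, which comes to roughly half the base model's errors.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (before the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical code-switched sentence pair (not taken from the dataset).
reference = "saya nak pergi meeting esok pagi"
hypothesis = "saya nak pergi mating esok"
sample_wer = word_error_rate(reference, hypothesis)
print(f"sample WER: {sample_wer:.3f}")

# Relative error reduction implied by the figures reported in this card.
base_wer, tuned_wer = 34.29, 17.26
relative_reduction = (base_wer - tuned_wer) / base_wer
print(f"relative WER reduction: {relative_reduction:.1%}")
```

In the sample pair there is one substitution and one deletion against six reference words, so the WER is 2/6.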

Model Details

Model Description

This is a fine-tuned version of the mesolitica/wav2vec2-xls-r-300m-mixed model on a custom Malay-English dataset. It is designed to transcribe speech that includes both Malay and English, especially in informal or conversational contexts where code-switching is common.

Model Sources

Uses

Direct Use

This model can be used to transcribe conversational Malay-English audio recordings, especially in domains such as:

  • Broadcast interviews
  • YouTube vlogs
  • Podcasts
  • Community recordings

Downstream Use

The model can be fine-tuned further or used as part of downstream applications such as:

  • Real-time transcription services
  • Voice assistants tailored for Malaysian users
  • Speech-driven translation systems

Out-of-Scope Use

  • High-stakes transcription scenarios (e.g., legal or medical contexts) where exact word accuracy is critical
  • Non-Malay, non-English languages
  • Noisy or far-field audio environments (unless fine-tuned further)

Bias, Risks, and Limitations

Known Limitations

  • May underperform on accents or dialects not well-represented in training data
  • Inconsistent casing or punctuation handling (model is CTC-based)
  • Limited robustness to background noise or overlapping speakers
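Because casing and punctuation are inconsistent, it can help to normalize both the reference and the model output before comparing them, so that formatting differences are not counted as word errors. The following is a minimal sketch under assumed rules (lowercase, drop punctuation except apostrophes, collapse whitespace); the helper name `normalize_transcript` is hypothetical and this card does not state which normalization, if any, was applied during evaluation.

```python
import re
import string

# Every ASCII punctuation mark except the apostrophe, which is kept so
# that contractions such as "don't" stay a single word.
_PUNCT_CHARS = string.punctuation.replace("'", "")
_PUNCT_RE = re.compile(f"[{re.escape(_PUNCT_CHARS)}]")


def normalize_transcript(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    text = text.lower()
    text = _PUNCT_RE.sub(" ", text)
    return " ".join(text.split())


# Hypothetical example strings.
print(normalize_transcript("Okay, jom pergi LUNCH!"))
print(normalize_transcript("Don't   lupa, meeting pukul 3."))
```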

Recommendations

  • Always verify outputs for critical tasks
  • Pair with punctuation restoration or diarization for production-grade use
  • Retrain with domain-specific data for higher accuracy

How to Get Started with the Model

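from transformers import pipeline

# Load the fine-tuned checkpoint into an automatic speech recognition pipeline
asr = pipeline("automatic-speech-recognition", model="langminer/wav2vec2-custom-asr")

# Transcribe a local audio file; the pipeline returns a dict with a "text" key
transcription = asr("your_audio_file.wav")
print(transcription["text"])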
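Before transcribing, it can be useful to inspect the audio file. wav2vec2-family checkpoints are generally used with 16 kHz mono audio; this card does not state the expected rate, so the `EXPECTED_RATE` value below is an assumption, and the helper name `describe_wav` is hypothetical. This matters most when raw arrays are passed to the pipeline instead of a file path. The sketch uses only the Python standard library and writes a short synthetic tone so that it runs without any real recording.

```python
import math
import os
import struct
import tempfile
import wave

EXPECTED_RATE = 16_000  # assumption: typical rate for wav2vec2-style models


def describe_wav(path: str) -> dict:
    """Report basic properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration_s = wav.getnframes() / rate
    return {
        "sample_rate": rate,
        "channels": channels,
        "duration_s": duration_s,
        "needs_resampling": rate != EXPECTED_RATE,
    }


# Write one second of a 440 Hz tone at 44.1 kHz as a stand-in recording.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)   # mono
    out.setsampwidth(2)   # 16-bit samples
    out.setframerate(44_100)
    samples = (
        int(12_000 * math.sin(2 * math.pi * 440 * n / 44_100))
        for n in range(44_100)
    )
    out.writeframes(b"".join(struct.pack("<h", s) for s in samples))

info = describe_wav(demo_path)
print(info)
```

A file flagged with `needs_resampling` would need to be converted to the expected rate with an audio tool of your choice before being passed to the model as a raw array.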