Important Note:
This model is intended for streaming use. It is fine-tuned to reduce false-positive transcriptions (hallucinations) on silent or background-noise audio.
Noise is labeled as "%nz"; remove or replace this pattern in the transcription result.
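A minimal sketch of the post-processing step described in the note, assuming the tag appears literally as `%nz` in the decoded text. The helper names `strip_noise_tag` and `transcribe_clean` are hypothetical and not part of this model or of transformers.

```python
import re

NOISE_TAG = "%nz"  # noise label named in the note above


def strip_noise_tag(text: str, replacement: str = "") -> str:
    """Remove (or replace) every "%nz" marker, then tidy leftover whitespace."""
    cleaned = text.replace(NOISE_TAG, replacement)
    # Collapse whitespace runs left behind by removed markers.
    return re.sub(r"\s+", " ", cleaned).strip()


def transcribe_clean(asr, audio_path: str) -> str:
    """Run any ASR callable that returns {"text": ...} and strip the noise tag.

    `asr` could be, for example, a transformers pipeline built once with:
        pipeline("automatic-speech-recognition",
                 model="JackyHoCL/whisper-large-v3-turbo-cantonese-noise-detection")
    """
    return strip_noise_tag(asr(audio_path)["text"])
```

In a streaming loop, a chunk whose cleaned text is empty can be treated as silence or noise and skipped.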
2025-07-21: CER:
Dataset | Lang | Split | CER(in %) |
---|---|---|---|
Training | yue | validation | 3.234 |
mozilla-foundation/common_voice_17_0 | yue | test | 0.437 |
mozilla-foundation/common_voice_17_0 | en | test(2k samples) | 5.18 |
mozilla-foundation/common_voice_16_1 | zh-CN | test | 11.74 |
JackyHoCL/cleaned_mixed_cantonese_and_english_speech | yue | test | 9.86 |
Sunbird/urban-noise-uganda-61k:small(1k) | noise | half(500) | 12.7 |
2025-07-06: CER:
Dataset | Lang | Split | CER(in %) |
---|---|---|---|
Training | yue | validation | 8.95 |
mozilla-foundation/common_voice_17_0 | yue | test | 8.78 |
mozilla-foundation/common_voice_16_1 | yue | test | 8.76 |
JackyHoCL/cleaned_mixed_cantonese_and_english_speech | yue | test | 8.00 |
Sunbird/urban-noise-uganda-61k:small(1k) | noise | half(500) | 0.0 |
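For readers unfamiliar with the metric, CER is conventionally the character-level edit distance between hypothesis and reference, divided by the reference length. The sketch below shows that standard definition; it is an assumption that the figures above were computed this way, since the evaluation script and any text normalization are not stated here. The function names are illustrative.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer_percent(references, hypotheses) -> float:
    """Corpus-level CER in percent: total edits / total reference characters."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("CER is undefined when all references are empty")
    return 100.0 * total_edits / total_chars
```

Note that this definition is undefined for empty references, so how the noise rows were scored is not something this sketch can reproduce.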
Train Args:
per_device_train_batch_size=32,
learning_rate=1e-6,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
per_device_eval_batch_size=16,
generation_max_length=225,
Hardware:
NVIDIA Tesla V100 16GB * 4
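As a quick arithmetic check, the effective global batch size implied by these settings can be sketched as below. This assumes all four GPUs were used as data-parallel workers, which the card does not state explicitly; the function name is illustrative.

```python
def effective_batch_size(per_device: int, accumulation_steps: int, devices: int) -> int:
    """Samples contributing to one optimizer step under plain data parallelism."""
    return per_device * accumulation_steps * devices


per_device_train_batch_size = 32   # from the training arguments above
gradient_accumulation_steps = 1    # from the training arguments above
num_gpus = 4                       # assumption: all four V100s train in parallel

global_batch = effective_batch_size(
    per_device_train_batch_size, gradient_accumulation_steps, num_gpus
)
print(global_batch)  # 32 * 1 * 4 = 128
```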
An example real-time streaming application built on this model:
https://github.com/JackyHoCL/whisper-realtime.git
FAQ:
- If you hit a tokenizer issue during inference, update transformers to version 4.46.3 or later:
pip install --upgrade transformers
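To check whether an upgrade is needed before running inference, a small standard-library sketch like the following can compare the installed version against 4.46.3. The helper names are hypothetical, and the comparison deliberately ignores pre-release suffixes such as `.dev0`.

```python
import re
from importlib import metadata

MIN_TRANSFORMERS = (4, 46, 3)  # minimum version named in the FAQ


def parse_version(version: str) -> tuple:
    """Leading numeric components of a version string, e.g. '4.47.0.dev0' -> (4, 47, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if not match:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def meets_minimum(version: str, minimum: tuple = MIN_TRANSFORMERS) -> bool:
    """True if `version` is at least `minimum` (pre-release suffixes are ignored)."""
    return parse_version(version) >= minimum


def installed_transformers_ok() -> bool:
    """True if transformers is installed and new enough; False otherwise."""
    try:
        return meets_minimum(metadata.version("transformers"))
    except metadata.PackageNotFoundError:
        return False
```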
Model tree for JackyHoCL/whisper-large-v3-turbo-cantonese-noise-detection:
Base model: openai/whisper-large-v3
Finetuned: openai/whisper-large-v3-turbo