Important Note:

This model is intended for streaming use. It is trained to reduce false-positive transcriptions (hallucinations) on silence and background noise.
Noise is labeled as "%nz"; remove or replace this pattern in the transcription result.
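
A minimal sketch of the post-processing step described above. The function names are hypothetical, and loading the model through the transformers `automatic-speech-recognition` pipeline is an assumption, not something this card specifies. The whitespace cleanup is also an assumption about how the marker appears in the output.

```python
import re

NOISE_MARKER = "%nz"  # label the model emits for noise, per the note above


def strip_noise_marker(text: str, replacement: str = "") -> str:
    """Remove (or replace) every "%nz" marker and tidy leftover whitespace."""
    cleaned = text.replace(NOISE_MARKER, replacement)
    # Collapse runs of whitespace left behind by removed markers (assumption:
    # single spaces between tokens are acceptable in the final transcript).
    return re.sub(r"\s+", " ", cleaned).strip()


def transcribe(audio_path: str) -> str:
    """Hypothetical helper: transcribe one file and drop the noise label."""
    # Imported lazily so the helper above works without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model="JackyHoCL/whisper-large-v3-turbo-cantonese-noise-detection",
    )
    return strip_noise_marker(asr(audio_path)["text"])
```

In a streaming loop, a chunk whose cleaned text is empty can be treated as noise and skipped.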

CER (2025-07-21):

Dataset	Lang	Split	CER (%)
Training	yue	validation	3.234
mozilla-foundation/common_voice_17_0	yue	test	0.437
mozilla-foundation/common_voice_17_0	en	test(2k samples)	5.18
mozilla-foundation/common_voice_16_1	zh-CN	test	11.74
JackyHoCL/cleaned_mixed_cantonese_and_english_speech	yue	test	9.86
Sunbird/urban-noise-uganda-61k:small(1k)	noise	half(500)	12.7

CER (2025-07-06):

Dataset	Lang	Split	CER (%)
Training	yue	validation	8.95
mozilla-foundation/common_voice_17_0	yue	test	8.78
mozilla-foundation/common_voice_16_1	yue	test	8.76
JackyHoCL/cleaned_mixed_cantonese_and_english_speech	yue	test	8.00
Sunbird/urban-noise-uganda-61k:small(1k)	noise	half(500)	0.0
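
For readers unfamiliar with the metric in the tables above, here is a minimal sketch of character error rate (CER): total character-level edit distance divided by total reference length, as a percentage. This is an illustrative pure-Python version; the card does not say which implementation or text normalization produced the reported numbers.

```python
from typing import Sequence


def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer_percent(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Corpus-level CER in percent: summed edits over summed reference length."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    return 100.0 * total_edits / total_chars
```

For example, `cer_percent(["你好世界"], ["你好世"])` gives 25.0, since one of four reference characters is missing.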

Train Args:

per_device_train_batch_size=32,
learning_rate=1e-6,
gradient_accumulation_steps=1,
gradient_checkpointing=True,
per_device_eval_batch_size=16,
generation_max_length=225,
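
A sketch of how the arguments above could be passed to the transformers trainer configuration. Using `Seq2SeqTrainingArguments` is an assumption (the card lists only the keyword values), and `output_dir` and the helper names are hypothetical. The effective batch size calculation assumes plain data-parallel training across all GPUs.

```python
# Values copied from the Train Args list above.
TRAIN_ARGS = {
    "per_device_train_batch_size": 32,
    "learning_rate": 1e-6,
    "gradient_accumulation_steps": 1,
    "gradient_checkpointing": True,
    "per_device_eval_batch_size": 16,
    "generation_max_length": 225,
}


def effective_batch_size(num_gpus: int) -> int:
    """Samples per optimizer step, assuming data-parallel training."""
    return (TRAIN_ARGS["per_device_train_batch_size"]
            * TRAIN_ARGS["gradient_accumulation_steps"]
            * num_gpus)


def build_training_arguments(output_dir: str):
    """Hypothetical helper: build the trainer config from the values above."""
    # Imported lazily; requires transformers and torch to be installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAIN_ARGS)
```

Under that assumption, four GPUs give `effective_batch_size(4) == 128`.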

Hardware:
4 x NVIDIA Tesla V100 16GB

A real-time streaming application example built on this model:
https://github.com/JackyHoCL/whisper-realtime.git

FAQ:

If you run into tokenizer issues during inference, update transformers to version 4.46.3 or later:

pip install --upgrade transformers
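
To check the installed version before loading the model, something like the following sketch can be used. The helper names are hypothetical, and the parser is deliberately simple: it compares only the first three numeric components and ignores pre-release suffixes.

```python
from importlib.metadata import PackageNotFoundError, version

MIN_TRANSFORMERS = (4, 46, 3)  # minimum version stated in the FAQ above


def parse_version(text: str) -> tuple:
    """Turn '4.46.3' or '4.47.0.dev0' into a 3-tuple of ints."""
    parts = []
    for piece in text.split(".")[:3]:
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # stop at suffixes such as 'rc1'
            digits += ch
        parts.append(int(digits or 0))
    while len(parts) < 3:
        parts.append(0)  # pad short versions like '4.46'
    return tuple(parts)


def transformers_is_recent_enough() -> bool:
    """True if transformers is installed at MIN_TRANSFORMERS or later."""
    try:
        installed = version("transformers")
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= MIN_TRANSFORMERS
```

If `transformers_is_recent_enough()` returns False, run the upgrade command above.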


Model size: 809M params
Tensor type: F32
Format: Safetensors

Model tree for JackyHoCL/whisper-large-v3-turbo-cantonese-noise-detection:

openai/whisper-large-v3 (base model)
openai/whisper-large-v3-turbo (finetuned)
JackyHoCL/whisper-large-v3-turbo-cantonese-yue-english (finetuned)
this model (finetuned)
