Welcome
If you find this model helpful, please like this model and star us on https://github.com/LianjiaTech/BELLE and https://github.com/shuaijiang/Whisper-Finetune
Belle-whisper-large-v3-zh
Belle-whisper-large-v3-zh is whisper-large-v3 fine-tuned to improve Chinese speech recognition. Compared with whisper-large-v3, it achieves a 24-65% relative improvement (CER reduction) on Chinese ASR benchmarks, including AISHELL-1, AISHELL-2, WenetSpeech, and HKUST.
Usage
```python
from transformers import pipeline

# Load the fine-tuned checkpoint as an ASR pipeline.
transcriber = pipeline(
    "automatic-speech-recognition",
    model="BELLE-2/Belle-whisper-large-v3-zh"
)

# Force the decoder to transcribe (not translate) in Chinese.
transcriber.model.config.forced_decoder_ids = (
    transcriber.tokenizer.get_decoder_prompt_ids(
        language="zh",
        task="transcribe"
    )
)

transcription = transcriber("my_audio.wav")
```
Fine-tuning
Model | (Re)Sample Rate | Train Datasets | Fine-tuning (full or peft) |
---|---|---|---|
Belle-whisper-large-v3-zh | 16 kHz | AISHELL-1, AISHELL-2, WenetSpeech, HKUST | full fine-tuning |
If you want to fine-tune the model on your own datasets, please refer to the Whisper-Finetune GitHub repo (https://github.com/shuaijiang/Whisper-Finetune).
CER(%) ↓
Model | Language Tag | aishell_1_test(↓) | aishell_2_test(↓) | wenetspeech_net(↓) | wenetspeech_meeting(↓) | HKUST_dev(↓) |
---|---|---|---|---|---|---|
whisper-large-v3 | Chinese | 8.085 | 5.475 | 11.72 | 20.15 | 28.597 |
Belle-whisper-large-v2-zh | Chinese | 2.549 | 3.746 | 8.503 | 14.598 | 16.289 |
Belle-whisper-large-v3-zh | Chinese | 2.781 | 3.786 | 8.865 | 11.246 | 16.440 |
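For readers unfamiliar with the metric, the character error rate (CER) reported above is the character-level edit distance between a reference transcript and the model output, divided by the number of reference characters. The following is a minimal sketch of that definition, not the scoring script used to produce this table. The function names `edit_distance` and `cer` are illustrative, and the whitespace handling is an assumption, since the card does not state which text normalization was applied.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences, using a rolling DP row."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference char missing in hypothesis
                curr[j - 1] + 1,     # insertion: extra char in hypothesis
                prev[j - 1] + cost,  # substitution, or match when cost is 0
            ))
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate in percent. Whitespace is ignored (an assumption)."""
    ref = [c for c in reference if not c.isspace()]
    hyp = [c for c in hypothesis if not c.isspace()]
    if not ref:
        raise ValueError("reference must contain at least one character")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


# One substituted character out of six reference characters.
example_cer = cer("今天天气很好", "今天天汽很好")
```

Benchmark-level CER is commonly computed by summing edits and reference characters over the whole test set rather than averaging per-utterance scores, so the two can differ when utterance lengths vary.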
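Compared with Belle-whisper-large-v2-zh, Belle-whisper-large-v3-zh is markedly better in complex acoustic scenes: on wenetspeech_meeting its CER drops from 14.598 to 11.246, a relative reduction of about 23%. On the other four test sets it is slightly behind Belle-whisper-large-v2-zh.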
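The relative improvements quoted in this card can be checked directly against the CER table. The sketch below copies the table values and computes the relative CER reduction of Belle-whisper-large-v3-zh over each of the other two models. The helper name `relative_reduction` and the dictionary layout are illustrative choices, not part of any released tooling.

```python
# CER (%) values copied from the table above.
CER_TABLE = {
    "whisper-large-v3": {
        "aishell_1_test": 8.085, "aishell_2_test": 5.475,
        "wenetspeech_net": 11.72, "wenetspeech_meeting": 20.15,
        "HKUST_dev": 28.597,
    },
    "Belle-whisper-large-v2-zh": {
        "aishell_1_test": 2.549, "aishell_2_test": 3.746,
        "wenetspeech_net": 8.503, "wenetspeech_meeting": 14.598,
        "HKUST_dev": 16.289,
    },
    "Belle-whisper-large-v3-zh": {
        "aishell_1_test": 2.781, "aishell_2_test": 3.786,
        "wenetspeech_net": 8.865, "wenetspeech_meeting": 11.246,
        "HKUST_dev": 16.440,
    },
}


def relative_reduction(baseline_cer, new_cer):
    """Relative CER reduction in percent; negative means the new model is worse."""
    return 100.0 * (baseline_cer - new_cer) / baseline_cer


def compare(baseline, new):
    """Per-test-set relative CER reduction of `new` over `baseline`."""
    return {
        test_set: relative_reduction(CER_TABLE[baseline][test_set],
                                     CER_TABLE[new][test_set])
        for test_set in CER_TABLE[baseline]
    }


vs_whisper = compare("whisper-large-v3", "Belle-whisper-large-v3-zh")
vs_v2 = compare("Belle-whisper-large-v2-zh", "Belle-whisper-large-v3-zh")

for test_set, gain in vs_whisper.items():
    print(f"{test_set}: {gain:.1f}% relative CER reduction vs whisper-large-v3")
```

Against whisper-large-v3 the reductions span roughly 24% (wenetspeech_net) to just under 66% (aishell_1_test). Against Belle-whisper-large-v2-zh only wenetspeech_meeting shows a positive reduction; the other entries in `vs_v2` come out slightly negative.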
Citation
Please cite our paper and GitHub repository when using our code, data, or model.
@misc{BELLE,
author = {BELLEGroup},
title = {BELLE: Be Everyone's Large Language model Engine},
year = {2023},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/LianjiaTech/BELLE}},
}