Whisper Small Chinese Base

This model is a fine-tuned version of openai/whisper-small on the MAGICDATA Mandarin Chinese Conversational Speech Corpus dataset. It achieves the following results on the evaluation set:

Loss: 0.4830
CER: 20.50%
Model Description & Intended Uses
This model was trained on a 180-hour conversational speech corpus, making it suitable for scenarios such as voice assistants. The training configuration was also chosen to encourage generalization and limit overfitting; the model reaches its best evaluation CER around the second training epoch.
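As a usage sketch, transcription with this checkpoint could look like the following (this assumes the standard transformers ASR pipeline; the audio file path is a placeholder):

```python
# Sketch: transcribing Mandarin audio with this checkpoint via the
# transformers automatic-speech-recognition pipeline.
# "sample.wav" is a placeholder path, not a file shipped with the model.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="AntiPollo/whisper-small-zh",
)

# Whisper is multilingual, so pin the language and task explicitly.
result = asr(
    "sample.wav",
    generate_kwargs={"language": "chinese", "task": "transcribe"},
)
print(result["text"])
```

Running this downloads the model weights on first use, so it requires network access.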
Limitations
A disclaimer: this model still has a relatively small parameter count, so its performance is unlikely to match that of larger ASR models. Additionally, because the base model, Whisper Small, is multilingual, this fine-tuned version should not be expected to surpass ASR models trained specifically for Chinese in Chinese transcription accuracy.
Training Hyperparameters
The following hyperparameters were used during training:
Hyperparameter | Value |
---|---|
learning_rate | 5e-6 |
train_batch_size | 1 |
gradient_accumulation_steps | 16 |
eval_batch_size | 3 |
warmup_steps | 600 |
weight_decay | 0.01 |
max_steps | 36000 |
gradient_checkpointing | False |
eval_strategy | steps |
save_steps | 3000 |
eval_steps | 3000 |
logging_steps | 100 |
load_best_model_at_end | True |
metric_for_best_model | CER |
greater_is_better | False |
report_to | TensorBoard |
dataloader_pin_memory | False (pinned memory disabled; training did not run on CUDA) |
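A minimal sketch of how the table above might map onto `Seq2SeqTrainingArguments` in transformers (the output directory and any unlisted defaults are assumptions, not taken from the original training script):

```python
from transformers import Seq2SeqTrainingArguments

# Sketch only: mirrors the hyperparameter table above.
# "whisper-small-zh" as output_dir is an assumption for illustration.
training_args = Seq2SeqTrainingArguments(
    output_dir="whisper-small-zh",
    learning_rate=5e-6,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,   # effective batch size: 1 * 16 = 16
    per_device_eval_batch_size=3,
    warmup_steps=600,
    weight_decay=0.01,
    max_steps=36000,
    gradient_checkpointing=False,
    eval_strategy="steps",
    save_steps=3000,
    eval_steps=3000,
    logging_steps=100,
    load_best_model_at_end=True,
    metric_for_best_model="cer",
    greater_is_better=False,          # lower CER is better
    report_to="tensorboard",
    dataloader_pin_memory=False,      # training did not run on CUDA
)
```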
Model Configuration
The following dropout hyperparameters were set in the model configuration:
Parameter | Value |
---|---|
dropout | 0.2 |
attention_dropout | 0.2 |
activation_dropout | 0.2 |
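These dropout values could be applied when loading the base checkpoint, for example (a sketch; the regularization values come from the table above, and extra keyword arguments to `from_pretrained` are forwarded to the model config):

```python
from transformers import WhisperForConditionalGeneration

# Sketch: load openai/whisper-small with the dropout settings from the table.
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-small",
    dropout=0.2,
    attention_dropout=0.2,
    activation_dropout=0.2,
)
```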
Training Results
Epoch | Validation Loss | CER (%) |
---|---|---|
0.26 | 0.4443 | 23.11 |
0.52 | 0.4358 | 22.27 |
0.79 | 0.4367 | 22.53 |
1.05 | 0.4733 | 22.55 |
1.31 | 0.4493 | 21.67 |
1.57 | 0.4595 | 21.57 |
1.84 | 0.4632 | 21.56 |
2.10 | 0.4830 | 20.50 |
2.36 | 0.4676 | 21.02 |
2.62 | 0.4820 | 22.66 |
2.89 | 0.4846 | 21.34 |
3.15 | 0.4976 | 21.50 |
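For reference, CER here is the character-level edit distance between hypothesis and reference, divided by the reference length. A minimal self-contained sketch (the sample strings below are made up for illustration and are not from the evaluation set):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance over reference length."""
    r, h = list(reference), list(hypothesis)
    # Dynamic-programming edit distance over characters.
    prev = list(range(len(h) + 1))
    for i, rc in enumerate(r, start=1):
        curr = [i] + [0] * len(h)
        for j, hc in enumerate(h, start=1):
            cost = 0 if rc == hc else 1
            curr[j] = min(prev[j] + 1,         # deletion
                          curr[j - 1] + 1,     # insertion
                          prev[j - 1] + cost)  # substitution
        prev = curr
    return prev[-1] / len(r)

# Illustrative strings: one substituted character out of six.
print(round(100 * cer("今天天气很好", "今天天汽很好"), 2))  # → 16.67
```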
Framework Versions
Library | Version |
---|---|
Transformers | 4.53.3 |
PyTorch | 2.7.1 |
Datasets | 4.0.0 |
Tokenizers | 0.21.2 |
Model tree for AntiPollo/whisper-small-zh
Base model: openai/whisper-small