ESPnet2 ASR model
espnet/shihlun-asr-commonvoice-zh-TW
This model was trained by Shih-Lun Wu using the commonvoice recipe in ESPnet.
Demo: How to use in ESPnet2
cd espnet
pip install -e .
cd egs2/commonvoice/asr1
./asr.sh \
    --stage 1 \
    --stop_stage 13 \
    --nj 32 \
    --inference_nj 32 \
    --skip_train true \
    --train_set "train_zh_TW" \
    --valid_set "dev_zh_TW" \
    --test_sets "dev_zh_TW test_zh_TW" \
    --lang "zh_TW" \
    --local_data_opts "--lang zh-TW" \
    --speed_perturb_factors "0.9 1.0 1.1" \
    --lm_train_text "data/train_zh_TW/text" \
    --token_type bpe \
    --nbpe 2542 \
    --bpemode "unigram" \
    --bpe_train_text "data/train_zh_TW/text" \
    --use_lm false \
    --inference_asr_model "valid.acc.best.pth" \
    --download_model "espnet/shihlun-asr-commonvoice-zh-TW"
RESULTS
Environments
- date: Thu Sep 1 21:49:10 UTC 2022
- python version: 3.9.12 (main, Jun 1 2022, 11:38:51) [GCC 7.5.0]
- espnet version: espnet 202207
- pytorch version: pytorch 1.12.1+cu102
- Git hash: 13db69d3befc3c82a5ff5a11e28bf79d5030603f
- Commit date: Mon Aug 29 13:44:35 2022 +0000
asr_train_asr_conformer5_raw_zh_TW_bpe2542_sp_lr1.0
CER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| inference_asr_model_valid.acc.best/dev_zh_TW | 2627 | 22200 | 97.7 | 2.1 | 0.2 | 0.0 | 2.4 | 9.5 |
| inference_asr_model_valid.acc.best/test_zh_TW | 2627 | 21991 | 98.0 | 1.6 | 0.4 | 0.1 | 2.1 | 7.7 |
TER
| dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
|---|---|---|---|---|---|---|---|---|
| inference_asr_model_valid.acc.best/dev_zh_TW | 2627 | 24827 | 98.6 | 1.2 | 0.2 | 0.0 | 1.5 | 4.0 |
| inference_asr_model_valid.acc.best/test_zh_TW | 2627 | 24618 | 98.8 | 0.9 | 0.4 | 0.1 | 1.3 | 3.4 |
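The Err column in both tables is an error rate: substitutions, deletions, and insertions summed and divided by the number of reference units (the Wrd column, which here most likely counts characters for CER and BPE tokens for TER). Each column is rounded to one decimal place independently, so Sub + Del + Ins can differ from Err by 0.1, as in the dev_zh_TW CER row (2.1 + 0.2 + 0.0 versus 2.4). The following is a minimal sketch of that calculation; the helper name `err_rate` and the counts passed to it are hypothetical and are not taken from this model's scoring logs.

```shell
#!/bin/sh
# Hypothetical helper: err_rate REF_UNITS SUB DEL INS
# Prints the error rate in percent with one decimal place:
#   100 * (SUB + DEL + INS) / REF_UNITS
# REF_UNITS must be non-zero.
err_rate() {
  awk -v n="$1" -v s="$2" -v d="$3" -v i="$4" \
    'BEGIN { printf "%.1f\n", 100 * (s + d + i) / n }'
}

# Made-up counts for illustration only:
# 1000 reference units, 16 substitutions, 4 deletions, 1 insertion.
err_rate 1000 16 4 1   # prints 2.1
```

Working from raw counts rather than from the rounded percentages avoids the 0.1 discrepancy noted above.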