Please read the CC-BY-NC-4.0 license before downloading this model.
imprt/kushinada-hubert-large-lavorotv2-asr
This is an ESPnet2 ASR model built on imprt/kushinada-hubert-large and trained on LaboroTVSpeech2 and CSJ using ESPnet.
cd espnet
pip install -e .
cd egs2/laborotv/asr1
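# Copy all files from this model repository into this directory.
# Place s3prl/kushinada-hubert-large-s3prl.pt from imprt/kushinada-hubert-large in the exp/ directory.
# Prepare the tedx-jp-10k data beforehand; the command below skips data preparation and training.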
./run_v2.sh --skip_data_prep true --skip-train true
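Because the run skips training and relies on a pre-trained checkpoint, it can help to confirm the checkpoint is in place before launching it. The following is a minimal sketch only: the helper `check_ckpt` and the `CKPT` variable are hypothetical, and the exact file location under `exp/` is an assumption, since the steps above only say to store the file in the `exp/` directory.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the given file exists and is non-empty.
check_ckpt() {
    [ -s "$1" ]
}

# Assumed location of the checkpoint; adjust to wherever it was actually stored.
ckpt="${CKPT:-exp/kushinada-hubert-large-s3prl.pt}"

if check_ckpt "$ckpt"; then
    echo "found checkpoint: $ckpt"
else
    echo "checkpoint missing or empty: $ckpt" >&2
fi
```

If the checkpoint is reported as found, the `./run_v2.sh` command above can be run as written.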
date: Fri Mar 7 11:10:00 JST 2025
python version: 3.10.14 (main, Jul 10 2024, 13:18:49) [GCC 13.2.0]
espnet version: espnet 202402
pytorch version: pytorch 2.3.1+cu121
Git hash: 19787b1793eda2b4007aa5b2c4d03adf6c18abfb
Commit date: Fri Jun 14 19:27:35 2024 +0900
dataset | Snt | Wrd | Corr | Sub | Del | Ins | Err | S.Err |
---|---|---|---|---|---|---|---|---|
decode_asr_lm_lm_train_lm_v2_jp_char_valid.loss.ave_asr_model_valid.acc.ave/tedx-jp-10k | 10000 | 190568 | 91.0 | 4.4 | 4.6 | 1.9 | 10.9 | 57.4 |
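The columns in this row can be cross-checked against each other. The sketch below assumes the table follows the usual sclite-style convention, where all rate columns are percentages of the reference tokens, Err = Sub + Del + Ins, and Corr = 100 - Sub - Del. The variable names are illustrative only; the numbers are copied from the tedx-jp-10k row.

```shell
#!/bin/sh
# Percentages copied from the tedx-jp-10k row of the results table.
sub=4.4
del=4.6
ins=1.9

# Total error rate: substitutions + deletions + insertions.
err=$(awk -v s="$sub" -v d="$del" -v i="$ins" 'BEGIN { printf "%.1f", s + d + i }')

# Correct rate: reference tokens that were neither substituted nor deleted.
corr=$(awk -v s="$sub" -v d="$del" 'BEGIN { printf "%.1f", 100 - s - d }')

echo "Err=$err Corr=$corr"
```

Both values agree with the reported Err of 10.9 and Corr of 91.0. Insertions count toward Err but not against Corr, which is why Corr and Err do not sum to 100.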
@inproceedings{watanabe2018espnet,
author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
title={{ESPnet}: End-to-End Speech Processing Toolkit},
year={2018},
booktitle={Proceedings of Interspeech},
pages={2207--2211},
doi={10.21437/Interspeech.2018-1456},
url={http://dx.doi.org/10.21437/Interspeech.2018-1456}
}
or arXiv:
@misc{watanabe2018espnet,
title={ESPnet: End-to-End Speech Processing Toolkit},
author={Shinji Watanabe and Takaaki Hori and Shigeki Karita and Tomoki Hayashi and Jiro Nishitoba and Yuya Unno and Nelson Yalta and Jahn Heymann and Matthew Wiesner and Nanxin Chen and Adithya Renduchintala and Tsubasa Ochiai},
year={2018},
eprint={1804.00015},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
@inproceedings{9413425,
author={Ando, Shintaro and Fujihara, Hiromasa},
title={Construction of a Large-Scale Japanese ASR Corpus on TV Recordings},
booktitle={ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2021},
pages={6948--6952},
keywords={Training;TV;Buildings;Speech recognition;Signal processing;Acoustics;Iterative methods;Automatic speech recognition;Corpus},
doi={10.1109/ICASSP39728.2021.9413425}
}
@inproceedings{maekawa03_sspr,
title = {Corpus of spontaneous Japanese: its design and evaluation},
author = {Kikuo Maekawa},
year = {2003},
booktitle = {ISCA/IEEE Workshop on Spontaneous Speech Processing and Recognition},
pages = {paper MMO2},
}