T5 SCT model
ouktlab/t5_sct-jis-v1_corpus10-bccwj-wiki40b_std
This is a Japanese syllable-to-character translation (SCT) model for character (Kanji, Katakana and Hiragana) recognition.
- This model is based on the T5 architecture.
- It assumes the Japanese character tokenizer (v1) based on JIS X 0213.
- Details and examples are available in our GitHub repository.
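As a rough illustration of the character-level setup described above, the following minimal sketch shows how a syllable string could be mapped to per-character ids and back. `ToyCharTokenizer` is a hypothetical stand-in, not the JIS X 0213 tokenizer (v1) from the repository, and `load_sct_model` assumes that the checkpoint loads with the standard `T5ForConditionalGeneration` class from `transformers`, which this page does not confirm. For real inference, use the tokenizer and scripts from the GitHub repository.

```python
MODEL_ID = "ouktlab/t5_sct-jis-v1_corpus10-bccwj-wiki40b_std"


class ToyCharTokenizer:
    """Hypothetical character-level tokenizer (NOT the authors' JIS X 0213 v1 tokenizer)."""

    def __init__(self, chars):
        # Reserve id 0 for padding and id 1 for unknown characters.
        self.id_to_char = ["<pad>", "<unk>"] + sorted(set(chars))
        self.char_to_id = {c: i for i, c in enumerate(self.id_to_char)}

    def encode(self, text):
        # One id per character; characters outside the vocabulary map to <unk>.
        return [self.char_to_id.get(c, 1) for c in text]

    def decode(self, ids):
        # Drop the special tokens (ids 0 and 1) and join the remaining characters.
        return "".join(self.id_to_char[i] for i in ids if i > 1)


def load_sct_model(model_id=MODEL_ID):
    """Assumption: the checkpoint is loadable with the standard T5 class."""
    # Imported lazily so the toy tokenizer works without transformers installed.
    from transformers import T5ForConditionalGeneration

    return T5ForConditionalGeneration.from_pretrained(model_id)


# Toy vocabulary covering a katakana input and possible kana/kanji outputs.
tokenizer = ToyCharTokenizer("カンジかんじ漢字")
ids = tokenizer.encode("カンジ")
```

Here `ids` holds one integer per input character, and `tokenizer.decode(ids)` restores the original string. The real tokenizer's vocabulary, special tokens, and id assignment will differ.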
Citations
@inproceedings {rtakeda2025:apsipa,
author={Ryu Takeda and Kazunori Komatani},
title={Reducing Orthographic Dependency on Paired Data by Probabilistic Integration via Syllabogram for Japanese Dialogue Speech Recognition},
year={2025},
booktitle={Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) (to appear)},
}