KoSP2E ASR Recipe

This is the ESPnet2 recipe for the KoSP2E (Korean Speech Perception and Production Experiment) dataset.


Overview

The KoSP2E dataset is a large-scale Korean speech corpus designed for speech perception and production experiments. This recipe provides a full ASR pipeline using ESPnet2 with both Transformer and Conformer architectures.


Results

Environment

  • Date: Mon Nov 10 20:35:20 UTC 2025
  • Python: 3.10.19
  • ESPnet: 202509
  • PyTorch: 2.9.0+cu128
  • Model: Conformer (BPE=2000)
  • Decode: Transformer LM (valid.acc.ave)

WER

dataset   Snt    Wrd  Corr   Sub  Del  Ins   Err  S.Err
test     2320  22337  77.1  20.4  2.6  4.4  27.4   76.4

CER

dataset   Snt    Wrd  Corr  Sub  Del  Ins  Err  S.Err
test     2320  84267  92.5  5.7  1.8  1.7  9.2   76.4

TER

dataset   Snt    Wrd  Corr  Sub  Del  Ins   Err  S.Err
test     2320  65361  89.4  8.6  2.0  2.1  12.7   76.4
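
In score tables of this kind, the Err column is conventionally the sum of the Sub, Del, and Ins percentages, and Corr + Sub + Del should come to roughly 100. The following is a minimal Python sketch that checks the three reported rows under that assumption; the names `ROWS`, `error_rate`, and `is_consistent` are illustrative and not part of the recipe.

```python
# Rows copied from the result tables above: (Corr, Sub, Del, Ins, Err), in percent.
ROWS = {
    "WER": (77.1, 20.4, 2.6, 4.4, 27.4),
    "CER": (92.5, 5.7, 1.8, 1.7, 9.2),
    "TER": (89.4, 8.6, 2.0, 2.1, 12.7),
}


def error_rate(sub: float, dele: float, ins: float) -> float:
    """Total error rate as Sub + Del + Ins (assumed scoring convention)."""
    return round(sub + dele + ins, 1)


def is_consistent(corr, sub, dele, ins, err, tol=0.15):
    """Check a row against the two identities, allowing for rounding to one decimal."""
    err_ok = abs(error_rate(sub, dele, ins) - err) < tol
    ref_ok = abs(corr + sub + dele - 100.0) < tol  # every reference token is Corr, Sub, or Del
    return err_ok and ref_ok


if __name__ == "__main__":
    for metric, row in ROWS.items():
        print(metric, error_rate(*row[1:4]), is_consistent(*row))
```

The tolerance absorbs the one-decimal rounding in the reported figures, which is why the WER row passes even though its Corr, Sub, and Del values add up to 100.1.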
