Commit · f6ca04e
1 Parent(s): 7d3ab33
Improve description of the system
README.md
CHANGED
@@ -44,16 +44,17 @@ model-index:
 value: 11.26
 ---
 
-#
+# XLS-R-based CTC model with 5-gram language model from Common Voice
 
-This model is a version of [facebook/wav2vec2-xls-r-2b-22-to-16](https://huggingface.co/facebook/wav2vec2-xls-r-2b-22-to-16) fine-tuned mainly on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - NL dataset (see details below).
-It achieves the following results on the evaluation set (of Common Voice 8.0):
+This model is a version of [facebook/wav2vec2-xls-r-2b-22-to-16](https://huggingface.co/facebook/wav2vec2-xls-r-2b-22-to-16) fine-tuned mainly on the MOZILLA-FOUNDATION/COMMON_VOICE_8_0 - NL dataset (see details below), on which a small 5-gram language model is added based on the Common Voice training corpus. This model achieves the following results on the evaluation set (of Common Voice 8.0):
 - Wer: 0.0669
 - Cer: 0.0197
 
 ## Model description
 
-The model takes 16kHz sound input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the final result.
+The model takes 16kHz sound input, and uses a Wav2Vec2ForCTC decoder with 48 letters to output the final result.
+
+To improve accuracy, a beam decoder is used; the beams are scored based on 5-gram language model trained on the Common Voice 8 corpus.
 
 ## Intended uses & limitations
 