Update README.md
README.md
CHANGED
@@ -24,16 +24,16 @@ model-index:
 metrics:
 - name: Test WER
 type: wer
-value:
+value: 9.914
 ---
 # Wav2vec 2.0 large VoxRex Swedish

-Finetuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **3.
+Finetuned version of KB's [VoxRex large](https://huggingface.co/KBLab/wav2vec2-large-voxrex) model using Swedish radio broadcasts, NST and Common Voice data. Evaluation without a language model gives the following: WER for NST + Common Voice test set (2% of total sentences) is **3.617%**. WER for Common Voice test set is **9.914%** directly and **7.77%** with a 4-gram language model.

 When using this model, make sure that your speech input is sampled at 16 kHz.

 ## Training
-This model has additionally been pretrained on 3500h of a mix of Swedish local radio broadcasts, audio books and other audio sources. It has been fine-tuned for 120000 updates on NST + CommonVoice
+This model has additionally been pretrained on 3500h of a mix of Swedish local radio broadcasts, audio books and other audio sources. It has been fine-tuned for 120000 updates on NST + CommonVoice and then for an additional 20000 updates on CommonVoice only. The additional fine-tuning on CommonVoice hurts performance on the NST+CommonVoice test set somewhat and, unsurprisingly, improves it on the CommonVoice test set. It seems to perform generally better though [citation needed].
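For readers comparing the WER figures in this change, the following is a minimal sketch of how word error rate is commonly computed: the word-level edit distance between reference and hypothesis, divided by the number of reference words. It assumes the README's numbers follow this standard definition, which the change itself does not state. The function names `word_error_rate` and `relative_reduction` and the example sentences are hypothetical, and no text normalization (casing, punctuation) is applied.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]  # i deletions turn i reference words into nothing
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1      # reference word missing from hypothesis
            insertion = current[j - 1] + 1  # extra word in hypothesis
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)


def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which `improved` lowers `baseline`."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Made-up sentences, not taken from NST or Common Voice.
    reference = "det var en gång en katt"
    hypothesis = "det var en gång katt"
    print(f"WER: {word_error_rate(reference, hypothesis):.3f}")

    # Figures from the README: 9.914% without and 7.77% with the 4-gram LM.
    print(f"Relative reduction: {relative_reduction(9.914, 7.77):.3f}")
```

Read this way, the 4-gram language model lowers the Common Voice WER from 9.914% to 7.77%, a relative reduction of roughly 21.6%. Note that WER can exceed 100% when the hypothesis contains many inserted words.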