Commit 45ee416 by GetmanY1 (verified)
1 Parent(s): e2eced5

Update README.md

  1. README.md +10 -10
README.md CHANGED
@@ -6,7 +6,7 @@ tags:
  - fi
  - finnish
  model-index:
- - name: wav2vec2-base-fi-voxpopuli-v2-1500h
+ - name: wav2vec2-large-uralic-voxpopuli-v2-1500h
  results:
  - task:
  name: Automatic Speech Recognition
@@ -18,25 +18,25 @@ model-index:
  metrics:
  - name: Dev WER
  type: wer
- value: 22.18
+ value: 19.14
  - name: Dev CER
  type: cer
- value: 5.96
+ value: 5.05
  - name: Test WER
  type: wer
- value: 24.43
+ value: 20.49
  - name: Test CER
  type: cer
- value: 6.97
+ value: 5.93
  ---
- # Colloquial Finnish Wav2vec2-Base ASR
+ # Colloquial Finnish Wav2vec2-Large ASR
 
- [facebook/wav2vec2-base-fi-voxpopuli-v2](https://huggingface.co/facebook/wav2vec2-base-fi-voxpopuli-v2) fine-tuned on 1500 hours of [Lahjoita puhetta (Donate Speech)](https://link.springer.com/article/10.1007/s10579-022-09606-3) on 16kHz sampled speech audio. When using the model make sure that your speech input is also sampled at 16Khz.
+ [facebook/wav2vec2-large-uralic-voxpopuli-v2](https://huggingface.co/facebook/wav2vec2-large-uralic-voxpopuli-v2) fine-tuned on 1500 hours of [Lahjoita puhetta (Donate Speech)](https://link.springer.com/article/10.1007/s10579-022-09606-3) on 16kHz sampled speech audio. When using the model make sure that your speech input is also sampled at 16Khz.
 
 
  ## Model description
 
- The Finnish Wav2Vec2 Base has the same architecture and uses the same training objective as the English and multilingual one described in [Paper](https://arxiv.org/abs/2006.11477). It is pre-trained on 2600 hours of unlabeled colloquial Finnish speech from [Lahjoita puhetta (Donate Speech)](https://link.springer.com/article/10.1007/s10579-022-09606-3).
+ The Finnish Wav2Vec2 Large has the same architecture and uses the same training objective as the English and multilingual one described in [Paper](https://arxiv.org/abs/2006.11477). It is pre-trained on 2600 hours of unlabeled colloquial Finnish speech from [Lahjoita puhetta (Donate Speech)](https://link.springer.com/article/10.1007/s10579-022-09606-3).
 
  You can read more about the pre-trained model from [this paper](TODO). The training scripts are available on [GitHub](https://github.com/aalto-speech/colloquial-Finnish-wav2vec2)
 
@@ -54,8 +54,8 @@ from datasets import load_dataset
  import torch
 
  # load model and processor
- processor = Wav2Vec2Processor.from_pretrained("wav2vec2-base-fi-voxpopuli-v2-1500h")
- model = Wav2Vec2ForCTC.from_pretrained("wav2vec2-base-fi-voxpopuli-v2-1500h")
+ processor = Wav2Vec2Processor.from_pretrained("GetmanY1/wav2vec2-large-uralic-voxpopuli-v2-1500h")
+ model = Wav2Vec2ForCTC.from_pretrained("GetmanY1/wav2vec2-large-uralic-voxpopuli-v2-1500h")
 
  # load dummy dataset and read soundfiles
  ds = load_dataset("mozilla-foundation/common_voice_16_1", "fi", split='test')
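
Note on the metric changes: this commit replaces four WER/CER values in the `model-index` block. The sketch below shows how such error rates are commonly computed (edit distance divided by reference length) and how the old and new values compare in relative terms. It is a minimal illustration, not the evaluation script behind the reported numbers: the helper names `edit_distance`, `error_rate`, and `relative_reduction` are hypothetical, no text normalization is applied, and reading the removed values as those of the Base model card is an assumption drawn from the renamed model entry.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion from the reference
                curr[j - 1] + 1,     # insertion into the hypothesis
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def error_rate(reference, hypothesis, unit="word"):
    """Return WER (unit='word') or CER (unit='char') as a percentage."""
    if unit == "word":
        ref, hyp = reference.split(), hypothesis.split()
    else:
        ref, hyp = list(reference), list(hypothesis)
    if not ref:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref, hyp) / len(ref)


def relative_reduction(old, new):
    """Relative error reduction in percent when going from old to new."""
    return 100.0 * (old - new) / old


# (removed value, added value) pairs copied from the diff above.
scores = {
    "Dev WER": (22.18, 19.14),
    "Dev CER": (5.96, 5.05),
    "Test WER": (24.43, 20.49),
    "Test CER": (6.97, 5.93),
}

for name, (old, new) in scores.items():
    print(f"{name}: {old} -> {new} "
          f"({relative_reduction(old, new):.1f}% relative reduction)")
```

Calling `error_rate(reference, hypothesis)` on a single transcript pair gives a per-utterance WER. Corpus-level figures like those in the model card are usually obtained by summing edit distances and reference lengths over all utterances before dividing, which this sketch does not do.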