update readme and add plots
Signed-off-by: monica-sekoyan <[email protected]>
- .gitattributes +1 -0
- README.md +9 -9
- plots/asr.png +3 -0
- plots/en_x.png +3 -0
- plots/x_en.png +3 -0
.gitattributes
CHANGED
@@ -34,3 +34,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
 canary-1b-v2.nemo filter=lfs diff=lfs merge=lfs -text
+*.png filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -42,7 +42,7 @@ We will soon release a comprehensive **Canary-1b-v2 technical report** detailing
42 |
43 | ### Automatic Speech Recognition (ASR)
44 |
45 | - using MUSAN music and noise samples \[16] on the [LibriSpeech Clean test set](https://www.openslr.org/12).
327 | **Metric**: Word Error Rate (**WER**)
328 |
329 | - | **SNR (dB)**
330 | - | ---------------
331 | - | **`Canary-1b-v2`** | 2.
332 |
333 |
334 | ### Hallucination Robustness
@@ -346,8 +346,8 @@ Number of characters per minute on [MUSAN](https://www.openslr.org/17) \[16] 48
346 |
347 | | **Dataset** | **WER ↓** |
348 | | ----------------------- | --------- |
349 | - | Earnings-22 |
350 | - | This American Life |
351 |
352 | **Note:** Presented WERs do not include Punctuation and Capitalization errors.
353 |
|
42 |
43 | ### Automatic Speech Recognition (ASR)
44 |
45 | + ![Canary ASR WER](plots/asr.png)
46 |
47 | *Figure 1: ASR WER comparison across different models. This does not include Punctuation and Capitalisation errors.*
48 |
|
52 |
53 | #### X → English
54 |
55 | + ![Canary X → En COMET](plots/x_en.png)
56 |
57 | *Figure 2: AST X → En COMET scores comparison across different models*
58 |
59 | #### English → X
60 |
61 |
62 | + ![Canary En → X COMET](plots/en_x.png)
63 |
64 | *Figure 3: AST En → X COMET scores comparison across different models*
65 |
|
283 |
284 | | **WER ↓** | Fleurs-25 Langs | CoVoST-13 Langs | MLS - 6 Langs |
285 | | --------------- | -------------------- | -------------------- | ------------------ |
286 | + | **`Canary-1b-v2`** | 8.40% | 8.85% | 7.27% |
287 |
288 |
289 | **Note:** Presented WERs do not include Punctuation and Capitalization errors.
|
326 | Performance across different Signal-to-Noise Ratios (SNR) using MUSAN music and noise samples \[16] on the [LibriSpeech Clean test set](https://www.openslr.org/12).
327 | **Metric**: Word Error Rate (**WER**)
328 |
329 | + | **SNR (dB)** | 100 | 10 | 5 | 0 | -5 |
330 | + | --------------- | ----- | ----- | ----- | ----- | ----- |
331 | + | **`Canary-1b-v2`** | 2.18% | 2.29% | 2.80% | 5.08% | 19.38% |
332 |
333 |
334 | ### Hallucination Robustness
|
346 |
347 | | **Dataset** | **WER ↓** |
348 | | ----------------------- | --------- |
349 | + | Earnings-22 | 13.51% |
350 | + | This American Life | 8.65% |
351 |
352 | **Note:** Presented WERs do not include Punctuation and Capitalization errors.
353 |
|
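The README note that presented WERs exclude punctuation and capitalization errors implies that reference and hypothesis text are normalized before scoring. The sketch below illustrates that idea under simple assumptions: lowercasing and stripping ASCII punctuation (which also removes apostrophes), then a word-level edit distance. The `normalize` and `wer` helpers are hypothetical names for illustration only; this is not the evaluation pipeline used to produce the numbers in the tables.

```python
import string


def normalize(text: str) -> list[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace.

    Assumption: a deliberately simple normalizer, not the one used for the README.
    """
    table = str.maketrans("", "", string.punctuation)
    return text.lower().translate(table).split()


def wer(reference: str, hypothesis: str, *, ignore_pc: bool = True) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref = normalize(reference) if ignore_pc else reference.split()
    hyp = normalize(hypothesis) if ignore_pc else hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)
```

For the pair `"Hello, world. How are you?"` and `"hello world how are you"`, `wer` returns 0.0 with normalization, while `ignore_pc=False` counts four of the five words as substitutions and returns 0.8.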
plots/asr.png
ADDED
plots/en_x.png
ADDED
plots/x_en.png
ADDED