naymaraq committed on
Commit 9e8c199 · verified · 1 Parent(s): b2804d8

Update README.md

Files changed (1)
  1. README.md +4 -5
README.md CHANGED
@@ -16,25 +16,25 @@ Deployment Geography: Global <br>
 Use Case: Developers, speech processing engineers, and AI researchers will use it as the first step for other speech processing models. <br>
 
 
-## Reference
+## References:
 [1] Jia, Fei, Somshubra Majumdar, and Boris Ginsburg. "MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. <br>
 [2] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
 <br>
 
-## Model Architecture
+## Model Architecture:
 
 **Architecture Type:** Convolutional Neural Network (CNN) <br>
 **Network Architecture:** MarbleNet <br>
 
 **This model has 91.5K of model parameters** <br>
 
-### Input
+## Input: <br>
 **Input Type(s):** Audio <br>
 **Input Format:** .wav files <br>
 **Input Parameters:** 1D <br>
 **Other Properties Related to Input:** 16000 Hz Mono-channel Audio, Pre-Processing Not Needed <br>
 
-### Output:
+## Output: <br>
 **Output Type(s):** Sequence of speech probabilities for each 20 millisecond frame <br>
 **Output Format:** Float Array <br>
 **Output Parameters:** 1D <br>
@@ -52,7 +52,6 @@ TODO
 **Runtime Engine(s):**
 * NeMo-2.0.0 <br>
 
-
 **Supported Hardware Microarchitecture Compatibility:** <br>
 * [NVIDIA Ampere] <br>
 * [NVIDIA Blackwell] <br>
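The README's Output section describes a 1-D float array with one speech probability per 20 millisecond frame. A minimal sketch of how such an array could be turned into speech segments by thresholding (plain Python, independent of NeMo; the function name, the 0.5 threshold, and the lack of smoothing/padding are illustrative assumptions, not part of the model card):

```python
def probs_to_segments(probs, frame_ms=20, threshold=0.5):
    """Convert per-frame speech probabilities (one value per `frame_ms`
    milliseconds, as produced by a frame-level VAD model) into a list of
    (start_ms, end_ms) speech segments.

    `threshold=0.5` is an illustrative default, not a documented value.
    """
    segments = []
    start = None  # index of the frame where the current speech run began
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # speech run opens at this frame
        elif p < threshold and start is not None:
            segments.append((start * frame_ms, i * frame_ms))
            start = None
    if start is not None:  # speech run extends to the end of the audio
        segments.append((start * frame_ms, len(probs) * frame_ms))
    return segments


# Example: 8 frames = 160 ms of audio; frames 2-5 exceed the threshold.
probs = [0.1, 0.2, 0.9, 0.95, 0.8, 0.7, 0.1, 0.05]
print(probs_to_segments(probs))  # [(40, 120)]
```

Real pipelines typically add onset/offset smoothing and minimum-duration filtering on top of raw thresholding; this sketch only shows the basic frame-to-timestamp conversion implied by the 20 ms frame rate above.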