Update README.md
README.md
CHANGED
@@ -16,25 +16,25 @@ Deployment Geography: Global <br>
  16 | Use Case: Developers, speech processing engineers, and AI researchers will use it as the first step for other speech processing models. <br>
  17 |
  18 |
- 19 | ##
  20 | [1] Jia, Fei, Somshubra Majumdar, and Boris Ginsburg. "MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. <br>
  21 | [2] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
  22 | <br>
  23 |
- 24 | ## Model Architecture
  25 |
  26 | **Architecture Type:** Convolutional Neural Network (CNN) <br>
  27 | **Network Architecture:** MarbleNet <br>
  28 |
  29 | **This model has 91.5K of model parameters** <br>
  30 |
- 31 |
  32 | **Input Type(s):** Audio <br>
  33 | **Input Format:** .wav files <br>
  34 | **Input Parameters:** 1D <br>
  35 | **Other Properties Related to Input:** 16000 Hz Mono-channel Audio, Pre-Processing Not Needed <br>
  36 |
- 37 |
  38 | **Output Type(s):** Sequence of speech probabilities for each 20 millisecond frame <br>
  39 | **Output Format:** Float Array <br>
  40 | **Output Parameters:** 1D <br>
@@ -52,7 +52,6 @@ TODO
  52 | **Runtime Engine(s):**
  53 | * NeMo-2.0.0 <br>
  54 |
- 55 |
  56 | **Supported Hardware Microarchitecture Compatibility:** <br>
  57 | * [NVIDIA Ampere] <br>
  58 | * [NVIDIA Blackwell] <br>

  16 | Use Case: Developers, speech processing engineers, and AI researchers will use it as the first step for other speech processing models. <br>
  17 |
  18 |
+ 19 | ## References:
  20 | [1] Jia, Fei, Somshubra Majumdar, and Boris Ginsburg. "MarbleNet: Deep 1D Time-Channel Separable Convolutional Neural Network for Voice Activity Detection." ICASSP 2021-2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2021. <br>
  21 | [2] [NVIDIA NeMo Toolkit](https://github.com/NVIDIA/NeMo)
  22 | <br>
  23 |
+ 24 | ## Model Architecture:
  25 |
  26 | **Architecture Type:** Convolutional Neural Network (CNN) <br>
  27 | **Network Architecture:** MarbleNet <br>
  28 |
  29 | **This model has 91.5K of model parameters** <br>
  30 |
+ 31 | ## Input: <br>
  32 | **Input Type(s):** Audio <br>
  33 | **Input Format:** .wav files <br>
  34 | **Input Parameters:** 1D <br>
  35 | **Other Properties Related to Input:** 16000 Hz Mono-channel Audio, Pre-Processing Not Needed <br>
  36 |
+ 37 | ## Output: <br>
  38 | **Output Type(s):** Sequence of speech probabilities for each 20 millisecond frame <br>
  39 | **Output Format:** Float Array <br>
  40 | **Output Parameters:** 1D <br>

  52 | **Runtime Engine(s):**
  53 | * NeMo-2.0.0 <br>
  54 |
  55 | **Supported Hardware Microarchitecture Compatibility:** <br>
  56 | * [NVIDIA Ampere] <br>
  57 | * [NVIDIA Blackwell] <br>
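
Note on the Output fields in this README: the model is described as returning a 1D float array with one speech probability per 20 millisecond frame. As a rough illustration of how a downstream step might consume that array, the sketch below groups consecutive frames into speech segments with start and end times. The function name `probs_to_segments` and the 0.5 threshold are assumptions made for this example only; neither comes from the README or from NeMo.

```python
from typing import List, Tuple

# Frame duration stated in the README: one probability per 20 ms frame.
FRAME_SECONDS = 0.02


def probs_to_segments(
    probs: List[float], threshold: float = 0.5
) -> List[Tuple[float, float]]:
    """Group consecutive frames with probability >= threshold into segments.

    Hypothetical helper for illustration. The default threshold of 0.5 is an
    assumption, not a value recommended by the model card.
    Returns a list of (start_seconds, end_seconds) tuples.
    """
    segments: List[Tuple[float, float]] = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        if p >= threshold and start is None:
            start = i  # a speech segment opens at this frame
        elif p < threshold and start is not None:
            # the segment closes at the start of the first non-speech frame
            segments.append(
                (round(start * FRAME_SECONDS, 3), round(i * FRAME_SECONDS, 3))
            )
            start = None
    if start is not None:
        # the audio ended while a segment was still open
        segments.append(
            (round(start * FRAME_SECONDS, 3), round(len(probs) * FRAME_SECONDS, 3))
        )
    return segments


if __name__ == "__main__":
    print(probs_to_segments([0.1, 0.9, 0.8, 0.2, 0.7]))
```

In practice the threshold would be tuned on held-out audio, and short gaps or very short segments are often merged or dropped afterwards; this sketch does neither.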