Update README.md
README.md
CHANGED
@@ -19,8 +19,8 @@ datasets:
 
 This repository provides all the necessary tools for using a [scalable HiFiGAN Unit](https://arxiv.org/abs/2406.10735) vocoder trained with [LibriTTS](https://www.openslr.org/141/).
 
-The pre-trained model take as input
-To generate the
+The pre-trained model take as input discrete self-supervised representations and produces a waveform as output. This is suitable for a wide range of generative tasks such as speech enhancement, separation, text-to-speech, voice cloning, etc. Please read [DASB - Discrete Audio and Speech Benchmark](https://arxiv.org/abs/2406.14294) for more information.
+To generate the discrete self-supervised representations, we employ a K-means clustering model trained on `microsoft/wavlm-large` hidden layers, with k=1000.
 
 ## Install SpeechBrain
 
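The added README lines describe turning continuous WavLM features into discrete units with a K-means model (k=1000). A minimal sketch of that quantization step follows, assuming nearest-centroid assignment by Euclidean distance. The `quantize` helper, the random centroids, and the random frames are hypothetical stand-ins; this is not the SpeechBrain API, and a real pipeline would load the trained centroids and extract features from `microsoft/wavlm-large` (hidden size assumed to be 1024 here).

```python
import numpy as np


def quantize(features: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Map each frame (row of `features`) to the index of its nearest centroid."""
    # Squared Euclidean distance, expanded as ||x||^2 - 2 x.c + ||c||^2
    # so that no (frames, k, dim) intermediate array is materialized.
    dists = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ centroids.T
        + (centroids ** 2).sum(axis=1)
    )
    return dists.argmin(axis=1)


rng = np.random.default_rng(0)
k, dim = 1000, 1024  # k from the README; dim is an assumed feature size

# Hypothetical stand-ins for trained K-means centroids and WavLM frame features.
centroids = rng.standard_normal((k, dim))
frames = rng.standard_normal((50, dim))

tokens = quantize(frames, centroids)  # one integer unit id per frame
print(tokens.shape, tokens.dtype)
```

Each frame becomes one integer in `[0, k)`. Sequences of such unit ids are the kind of discrete input the unit vocoder is described as consuming.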