speechbrain/hifigan-wavlm-l1-3-7-12-18-23-k1000-LibriTTS

speech-synthesis


chaanks committed on Jul 17

Commit dae100a (1 parent: 5d9a5b0)

Update README.md

Files changed (1)

README.md +2 -2


@@ -19,8 +19,8 @@ datasets:
 This repository provides all the necessary tools for using a [scalable HiFiGAN Unit](https://arxiv.org/abs/2406.10735) vocoder trained with [LibriTTS](https://www.openslr.org/141/).
-The pre-trained model takes as input continuous self-supervised representations and produces a waveform as output. This is suitable for a wide range of generative tasks such as speech enhancement, separation, text-to-speech, voice cloning, etc. Please read [DASB - Discrete Audio and Speech Benchmark](https://arxiv.org/abs/2406.14294) for more information.
-To generate the continuous self-supervised representations, we use `microsoft/wavlm-large`.
+The pre-trained model takes as input discrete self-supervised representations and produces a waveform as output. This is suitable for a wide range of generative tasks such as speech enhancement, separation, text-to-speech, voice cloning, etc. Please read [DASB - Discrete Audio and Speech Benchmark](https://arxiv.org/abs/2406.14294) for more information.
+To generate the discrete self-supervised representations, we employ a K-means clustering model trained on `microsoft/wavlm-large` hidden layers, with k=1000.
 ## Install SpeechBrain
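
Note on the change above: the updated README says the vocoder consumes discrete units obtained by K-means clustering of `microsoft/wavlm-large` hidden layers with k=1000. As a rough illustration of what "discrete" means here, the sketch below maps each feature frame to the index of its nearest centroid. It is a toy example in plain Python, not the SpeechBrain API: the function names `squared_distance` and `quantize`, the 2-dimensional frames, and the three centroids are all hypothetical stand-ins for real WavLM features and a trained 1000-centroid codebook.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    frames: Sequence[Sequence[float]],
    centroids: Sequence[Sequence[float]],
) -> List[int]:
    """Map each continuous frame to the index of its nearest centroid.

    This is the assignment step of K-means: the returned indices play the
    role of the discrete units that a unit-based vocoder takes as input.
    """
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(frame, centroids[k]))
        for frame in frames
    ]


# Hypothetical toy codebook (a real one would hold k=1000 centroids).
centroids = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]

# Hypothetical continuous features, one vector per time step.
frames = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.3], [0.6, 0.6]]

units = quantize(frames, centroids)
print(units)  # [0, 1, 2, 1]
```

With a trained codebook, the same assignment step would turn a sequence of continuous frames into a sequence of integers in the range 0 to 999, which is the kind of input the added README lines describe.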