Update README.md
README.md
CHANGED
@@ -33,10 +33,11 @@ datasets:
 
 ## Model description
 
-Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder
-modelling alignment with Monotonic Alignment Search (MAS).
-making a high reduction on memory consumption
-
+Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder predicts phoneme durations and their mean feature vectors,
+modelling alignment with Monotonic Alignment Search (MAS). The decoder is essentially a U-Net inspired by Grad-TTS, based on a Transformer architecture combined
+with 1D instead of 2D CNNs, which greatly reduces memory consumption and speeds up synthesis.
+Matcha-TTS is non-autoregressive and is trained using optimal-transport conditional flow matching (OT-CFM).
+This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching.
 
 ## Intended uses and limitations
 
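As a reading aid for the OT-CFM sentence in the added description, here is a minimal sketch of the training target and of fixed-step ODE sampling. It is not taken from this repository: the function names, the `SIGMA_MIN` value, and the use of plain Python lists instead of tensors are all assumptions made for illustration. It uses the straight-line conditional path commonly written as x_t = (1 - (1 - sigma_min) t) x_0 + t x_1, with target vector field u_t = x_1 - (1 - sigma_min) x_0.

```python
# Hypothetical illustration of OT-CFM; names and constants are assumptions,
# not the API of the model described in this README.

SIGMA_MIN = 1e-4  # assumed small constant; the README does not state a value


def ot_cfm_pair(x0, x1, t, sigma_min=SIGMA_MIN):
    """Return (x_t, u_t) on the straight-line path from noise x0 towards data x1."""
    scale = 1.0 - (1.0 - sigma_min) * t
    x_t = [scale * a + t * b for a, b in zip(x0, x1)]
    u_t = [b - (1.0 - sigma_min) * a for a, b in zip(x0, x1)]
    return x_t, u_t


def cfm_loss(predicted, target):
    """Mean squared error between a predicted vector field and the target u_t."""
    return sum((p - q) ** 2 for p, q in zip(predicted, target)) / len(target)


def euler_sample(vector_field, x0, n_steps):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with fixed-step Euler."""
    x = list(x0)
    dt = 1.0 / n_steps
    for i in range(n_steps):
        t = i * dt
        v = vector_field(x, t)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


if __name__ == "__main__":
    noise = [0.5, -1.0]
    data = [2.0, 3.0]
    _, target = ot_cfm_pair(noise, data, t=0.3)
    # An "oracle" field that always returns the target stands in for a trained network.
    print(euler_sample(lambda x, t: target, noise, n_steps=4))
```

Because the target field along this path is constant in t, a model that fits it well can be integrated with only a few Euler steps, which is the intuition behind the "fewer synthesis steps" claim. A real decoder would replace the oracle lambda with a learned network conditioned on the encoder output.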