projecte-aina/matxa-tts-cat-multispeaker

acoustic modelling


AlexK-PL committed on Mar 29, 2024

Commit f08fdc4 · verified · 1 Parent(s): b780397

Update README.md

Files changed (1)

README.md +5 -4

@@ -33,10 +33,11 @@ datasets:
 ## Model description
-Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder side is inspired by previous works (Grad-TTS and Glow-TTS)
-modelling alignment with Monotonic Alignment Search (MAS). The decoder is essentially a U-Net inspired by Grad-TTS based on Transformers architecture combined with 1D CNNs,
-making a high reduction on memory consumption while increasing synthesis speed. Matcha-TTS is probabilistic, non-autoregressive and is trained using optimal-transport
-conditional flow matching (OT-CFM). This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching.
+Matcha-TTS is an encoder-decoder architecture designed for fast acoustic modelling in TTS. The encoder predicts phoneme durations and the corresponding mean feature vectors,
+modelling alignment with Monotonic Alignment Search (MAS). The decoder is essentially a U-Net inspired by Grad-TTS, built on a Transformer architecture combined
+with 1D instead of 2D CNNs, which greatly reduces memory consumption and speeds up synthesis.
+Matcha-TTS is non-autoregressive and is trained using optimal-transport conditional flow matching (OT-CFM).
+This yields an ODE-based decoder capable of high output quality in fewer synthesis steps than models trained using score matching.
 ## Intended uses and limitations
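
As a rough illustration of the OT-CFM training and ODE-based synthesis mentioned in the model description, the following NumPy sketch shows the straight-line conditional path, its regression target, and a fixed-step Euler sampler. It is not the model's code. The names `ot_cfm_pair`, `ot_cfm_loss`, `euler_sample`, the `vector_field` callable, and the value of `SIGMA_MIN` are all assumptions made for this example, and the real decoder is also conditioned on the encoder output, which is omitted here.

```python
import numpy as np

SIGMA_MIN = 1e-4  # assumed small constant; not taken from the README


def ot_cfm_pair(x0, x1, t, sigma_min=SIGMA_MIN):
    """Point x_t on the straight-line path from noise x0 to data x1, plus its target velocity."""
    x_t = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1.0 - sigma_min) * x0  # constant in t: the path is a straight line
    return x_t, u_t


def ot_cfm_loss(vector_field, x1, rng, sigma_min=SIGMA_MIN):
    """Mean squared error between a candidate vector field and the OT-CFM target."""
    x0 = rng.standard_normal(x1.shape)  # noise sample, one per data item
    # One time value per batch item, shaped to broadcast over the feature axes.
    t = rng.uniform(size=(x1.shape[0],) + (1,) * (x1.ndim - 1))
    x_t, u_t = ot_cfm_pair(x0, x1, t, sigma_min)
    return float(np.mean((vector_field(x_t, t) - u_t) ** 2))


def euler_sample(vector_field, x0, n_steps):
    """Integrate dx/dt = vector_field(x, t) from t=0 to t=1 with fixed Euler steps."""
    x = x0.copy()
    dt = 1.0 / n_steps
    for i in range(n_steps):
        x = x + dt * vector_field(x, i * dt)
    return x
```

Because the target velocity along each conditional path is constant, a well-trained vector field is close to straight, which is why a handful of Euler steps can be enough at synthesis time. `n_steps` is the speed/quality knob in this sketch.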