Text-to-Speech
PyTorch
ONNX
Catalan
matcha-tts
acoustic modelling
speech
multispeaker
Baybars committed
Commit
651d66f
1 Parent(s): e96083b

Update README.md

Files changed (1)
  1. README.md +6 -6
README.md CHANGED
@@ -14,7 +14,7 @@ datasets:
  - projecte-aina/openslr-slr69-ca-trimmed-denoised
  ---

- # Matcha-TTS Catalan Multispeaker

 ## Table of Contents
 <details>
@@ -32,12 +32,12 @@ datasets:

 ## Model Description

- **Matcha-TTS** is an encoder-decoder architecture designed for fast acoustic modelling in TTS.
 The encoder combines a text encoder with a phoneme duration predictor; together they predict averaged acoustic features.
 The decoder has essentially a U-Net backbone inspired by [Grad-TTS](https://arxiv.org/pdf/2105.06337.pdf), which is based on the Transformer architecture.
 There, replacing 2D CNNs with 1D CNNs yields a large reduction in memory consumption and fast synthesis.

- **Matcha-TTS** is a non-autoregressive model trained with optimal-transport conditional flow matching (OT-CFM).
 This yields an ODE-based decoder capable of generating high output quality in fewer synthesis steps than models trained using score matching.

 ## Intended Uses and Limitations
@@ -64,7 +64,7 @@ python -m venv /path/to/venv
 source /path/to/venv/bin/activate
 ```

- For training and inference with Catalan Matcha-TTS, you need to compile the provided espeak-ng with the Catalan phonemizer:
 ```bash
 git clone https://github.com/projecte-aina/espeak-ng.git

@@ -97,8 +97,8 @@ pip install -e .

 #### PyTorch

- End-to-end speech inference can be performed with **Catalan Matcha-TTS**.
- Both models (Catalan Matcha-TTS and Vocos) are loaded remotely from the HF Hub.

 First, export the following environment variables to include the installed espeak-ng version:
 
 
 - projecte-aina/openslr-slr69-ca-trimmed-denoised
 ---

+ # 🍵 Matxa-TTS Catalan Multispeaker

 ## Table of Contents
 <details>
 
 ## Model Description

+ 🍵 **Matxa-TTS** is based on **Matcha-TTS**, an encoder-decoder architecture designed for fast acoustic modelling in TTS.
 The encoder combines a text encoder with a phoneme duration predictor; together they predict averaged acoustic features.
 The decoder has essentially a U-Net backbone inspired by [Grad-TTS](https://arxiv.org/pdf/2105.06337.pdf), which is based on the Transformer architecture.
 There, replacing 2D CNNs with 1D CNNs yields a large reduction in memory consumption and fast synthesis.

+ **Matxa-TTS** is a non-autoregressive model trained with optimal-transport conditional flow matching (OT-CFM).
 This yields an ODE-based decoder capable of generating high output quality in fewer synthesis steps than models trained using score matching.

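The OT-CFM training target mentioned above can be sketched numerically. This is a toy illustration of the flow-matching interpolant, not the model's actual training code; the function name and the `sigma_min` value are assumptions:

```python
import numpy as np

def ot_cfm_target(x0, x1, t, sigma_min=1e-4):
    """Optimal-transport conditional flow matching (illustrative sketch):
    interpolate from noise x0 towards data x1 at time t and return the
    conditional velocity the decoder is trained to regress."""
    x_t = (1.0 - (1.0 - sigma_min) * t) * x0 + t * x1
    u_t = x1 - (1.0 - sigma_min) * x0  # constant along the OT path
    return x_t, u_t

rng = np.random.default_rng(0)
x0 = rng.standard_normal(80)  # noise sample (e.g. one mel-spectrogram frame)
x1 = rng.standard_normal(80)  # "data" sample
x_t, u_t = ot_cfm_target(x0, x1, t=0.5)
```

Because the conditional velocity does not depend on `t`, the paths the decoder learns are nearly straight, which is why few ODE steps suffice at synthesis time.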
 ## Intended Uses and Limitations
 
 source /path/to/venv/bin/activate
 ```

+ For training and inference with Catalan Matxa-TTS, you need to compile the provided espeak-ng with the Catalan phonemizer:
 ```bash
 git clone https://github.com/projecte-aina/espeak-ng.git

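# (Illustrative continuation: the README's actual build steps sit outside
# this hunk. A typical sequence, assuming the fork keeps upstream
# espeak-ng's autotools build; the install prefix is an example path.)
cd espeak-ng
./autogen.sh
./configure --prefix=/path/to/espeak-ng-install
make
make install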
 
 #### PyTorch

+ End-to-end speech inference can be performed with **Catalan Matxa-TTS**.
+ Both models (Catalan Matxa-TTS and alVoCat) are loaded remotely from the HF Hub.

 First, export the following environment variables to include the installed espeak-ng version:
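The concrete variable list follows in the full README and is not shown in this hunk. As a purely hypothetical illustration (variable names and paths are assumptions, not the model card's actual instructions), pointing a process at a locally installed espeak-ng typically looks like:

```bash
# Hypothetical paths; substitute the prefix used when installing espeak-ng
export ESPEAK_PREFIX=/path/to/espeak-ng-install
export PATH="$ESPEAK_PREFIX/bin:$PATH"                        # find the new espeak-ng binary
export LD_LIBRARY_PATH="$ESPEAK_PREFIX/lib:$LD_LIBRARY_PATH"  # load its shared library first
```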