Text-to-Speech
PyTorch
ONNX
Catalan
matcha-tts
acoustic modelling
speech
multispeaker
AlexK-PL committed on
Commit
2a49114
1 Parent(s): 1517df0

Update README.md

Files changed (1)
  1. README.md +6 -12
README.md CHANGED
@@ -30,7 +30,7 @@ datasets:
 
  </details>
 
- ## Model description
+ ## Model Description
 
  **Matcha-TTS** is an encoder-decoder architecture designed for fast acoustic modelling in TTS.
  The encoder part is based on a text encoder and a phoneme duration prediction that together predict averaged acoustic features.
@@ -40,7 +40,7 @@ In the latter, by replacing 2D CNNs by 1D CNNs, a large reduction in memory cons
  **Matcha-TTS** is a non-autorregressive model trained with optimal-transport conditional flow matching (OT-CFM).
  This yields an ODE-based decoder capable of generating high output quality in fewer synthesis steps than models trained using score matching.
 
- ## Intended uses and limitations
+ ## Intended Uses and Limitations
 
  This model is intended to serve as an acoustic feature generator for multispeaker text-to-speech systems for the Catalan language.
  It has been finetuned using a Catalan phonemizer, therefore if the model is used for other languages it may will not produce intelligible samples after mapping
@@ -49,7 +49,7 @@ its output into a speech waveform.
  The quality of the samples can vary depending on the speaker.
  This may be due to the sensitivity of the model in learning specific frequencies and also due to the quality of samples for each speaker.
 
- ## How to use
+ ## How to Use
 
  ### Installation
 
@@ -85,13 +85,9 @@ pip install git+https://github.com/langtech-bsc/Matcha-TTS.git@dev-cat
 
  ```
 
-
  ### Generate
 
- ## Training
-
- ### Adaptation
-
+ ## Training Details
 
  ### Training data
 
@@ -102,13 +98,11 @@ The model was trained on 2 **Catalan** speech datasets
  | Festcat | ca | 22 |
  | OpenSLR69 | ca | 5 |
 
-
- ### Framework
+ ### Training procedure
 
 
  ## Evaluation
 
- ### Results
 
  ## Citation
 
@@ -125,7 +119,7 @@ If this code contributes to your research, please cite the work:
  }
  ```
 
- ## Additional information
+ ## Additional Information
 
  ### Author
  The Language Technologies Unit from Barcelona Supercomputing Center.