projecte-aina/matxa-tts-cat-multispeaker

acoustic modelling


AlexK-PL committed on Mar 29, 2024

Commit b9d9e1d · verified · 1 Parent(s): c30651e

Update README.md

Files changed (1)

README.md +8 -10


@@ -23,9 +23,9 @@ datasets:
 - [Model description](#model-description)
 - [Intended uses and limitations](#intended-uses-and-limitations)
 - [How to use](#how-to-use)
-- [Limitations and bias](#limitations-and-bias)
 - [Training](#training)
 - [Evaluation](#evaluation)
 - [Additional information](#additional-information)
 </details>
@@ -41,6 +41,13 @@ This yields an ODE-based decoder capable of high output quality in fewer synthes
 ## Intended uses and limitations
 ## How to use
 ```python
 import torch
@@ -68,15 +75,6 @@ generation = generator(
 print(f"Result: {generation[0]['generated_text']}")
 ```
-## Limitations and bias
-This model is intended to serve as an acoustic feature generator for multispeaker text-to-speech systems for the Catalan language.
-It has been fine-tuned using a Catalan phonemizer; if the model is used with other languages, it may not produce intelligible samples after its output is converted
-into a speech waveform.
-The quality of the samples can vary depending on the speaker.
-This may be due to the sensitivity of the model in learning specific frequencies and also due to the samples used for each speaker.
 ## Training
 ### Adaptation

 - [Model description](#model-description)
 - [Intended uses and limitations](#intended-uses-and-limitations)
 - [How to use](#how-to-use)
 - [Training](#training)
 - [Evaluation](#evaluation)
+- [Citation](#citation)
 - [Additional information](#additional-information)
 </details>
 ## Intended uses and limitations
+This model is intended to serve as an acoustic feature generator for multispeaker text-to-speech systems for the Catalan language.
+It has been fine-tuned using a Catalan phonemizer; if the model is used with other languages, it may not produce intelligible samples after its output is converted
+into a speech waveform.
+The quality of the samples can vary depending on the speaker.
+This may be due to the sensitivity of the model in learning specific frequencies and also due to the samples used for each speaker.
 ## How to use
 ```python
 import torch
 print(f"Result: {generation[0]['generated_text']}")
 ```
 ## Training
 ### Adaptation