projecte-aina/matxa-tts-cat-multispeaker

acoustic modelling

AlexK-PL committed on Mar 28, 2024

Commit e3c6df5 · verified · 1 Parent(s): 4522849

Update README.md

Files changed (1)

README.md +103 -1

@@ -1,3 +1,105 @@
 ---
-license: apache-2.0
+language:
+- ca
+license:
+- apache-2.0
+tags:
+- matcha TTS
+- speech
+- text-to-speech
+- catalan
+pipeline_tag: text-to-speech
+datasets:
+- projecte-aina/CATalog
 ---
+# Matcha TTS Catalan
+## Table of Contents
+<details>
+<summary>Click to expand</summary>
+- [Model description](#model-description)
+- [Intended uses and limitations](#intended-uses-and-limitations)
+- [How to use](#how-to-use)
+- [Limitations and bias](#limitations-and-bias)
+- [Training](#training)
+- [Evaluation](#evaluation)
+- [Additional information](#additional-information)
+</details>
+## Model description
+## Intended uses and limitations
+## How to use
+```python
+import torch
+from transformers import pipeline, AutoTokenizer
+# Catalan prompt: "I often find myself thinking about everything that"
+input_text = "Sovint em trobo pensant en tot allò que"
+model_id = "projecte-aina/FLOR-6.3B"
+tokenizer = AutoTokenizer.from_pretrained(model_id)
+generator = pipeline(
+    "text-generation",
+    model=model_id,
+    tokenizer=tokenizer,
+    torch_dtype=torch.bfloat16,
+    trust_remote_code=True,
+    device_map="auto",
+)
+generation = generator(
+    input_text,
+    do_sample=True,
+    top_k=10,
+    eos_token_id=tokenizer.eos_token_id,
+)
+print(f"Result: {generation[0]['generated_text']}")
+```
+## Limitations and bias
+At the time of submission, no measures have been taken to estimate the bias and toxicity embedded in the model.
+However, we are well aware that our models may be biased since the corpora have been collected using crawling techniques
+on multiple web sources. We intend to conduct research in these areas in the future, and if completed, this model card will be updated.
+## Training
+### Adaptation
+### Training data
+### Languages
+Data comes from two datasets: Festcat and OpenSLR69.
+### Framework
+## Evaluation
+### Results
+## Additional information
+### Author
+The Language Technologies Unit from Barcelona Supercomputing Center.
+### Contact
+For further information, please send an email to <[email protected]>.
+### Copyright
+Copyright (c) 2023 by Language Technologies Unit, Barcelona Supercomputing Center.
+### License
+[Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0)
+### Funding
+This work was funded by [Departament de la Vicepresidència i de Polítiques Digitals i Territori de la Generalitat de Catalunya](https://politiquesdigitals.gencat.cat/ca/inici/index.html#googtrans(ca|en)) within the framework of [Projecte AINA](https://politiquesdigitals.gencat.cat/ca/economia/catalonia-ai/aina).
+### Disclaimer