MahtaFetrat/Homo-GE2PE-Persian

@@ -1,4 +1,8 @@
 ---
 license: mit
 tags:
 - g2p
@@ -10,17 +14,15 @@ tags:
 - farsi
 - phonemization
 - homograph-disambiguation
-datasets:
-- MahtaFetrat/HomoRich-G2P-Persian
-language:
-- fa
 ---
 # Homo-GE2PE: Persian Grapheme-to-Phoneme Conversion with Homograph Disambiguation
 ![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Model-orange)
-**Homo-GE2PE** is a Persian grapheme-to-phoneme (G2P) model specialized in homograph disambiguation—words with identical spellings but context-dependent pronunciations (e.g., *مرد* pronounced as *mard* "man" or *mord* "died"). Introduced in *[Fast, Not Fancy: Rethinking G2P with Rich Data and Rule-Based Models](link)*, the model extends **GE2PE** by fine-tuning it on the **HomoRich** dataset, explicitly designed for such pronunciation challenges.
 ---
@@ -132,4 +134,4 @@ Contributions and pull requests are welcome. Please open an issue to discuss the
 * [Base GE2PE Model](https://github.com/Sharif-SLPL/GE2PE)
 * [HomoRich Dataset (Huggingface)](https://huggingface.co/datasets/MahtaFetrat/HomoRich-G2P-Persian)
 * [HomoRich Dataset (Github)](https://github.com/MahtaFetrat/HomoRich-G2P-Persian)
-* [SentenceBench Persian G2P Benchmark](https://huggingface.co/datasets/MahtaFetrat/SentenceBench)

 ---
+datasets:
+- MahtaFetrat/HomoRich-G2P-Persian
+language:
+- fa
 license: mit
 tags:
 - g2p
 - farsi
 - phonemization
 - homograph-disambiguation
+library_name: transformers
+pipeline_tag: text-to-speech
 ---
 # Homo-GE2PE: Persian Grapheme-to-Phoneme Conversion with Homograph Disambiguation
 ![Hugging Face](https://img.shields.io/badge/Hugging%20Face-Model-orange)
+**Homo-GE2PE** is a Persian grapheme-to-phoneme (G2P) model specialized in homograph disambiguation—words with identical spellings but context-dependent pronunciations (e.g., *مرد* pronounced as *mard* "man" or *mord* "died"). Introduced in *[Fast, Not Fancy: Rethinking G2P with Rich Data and Rule-Based Models](https://huggingface.co/papers/2505.12973)*, the model extends **GE2PE** by fine-tuning it on the **HomoRich** dataset, explicitly designed for such pronunciation challenges.
 ---
 * [Base GE2PE Model](https://github.com/Sharif-SLPL/GE2PE)
 * [HomoRich Dataset (Huggingface)](https://huggingface.co/datasets/MahtaFetrat/HomoRich-G2P-Persian)
 * [HomoRich Dataset (Github)](https://github.com/MahtaFetrat/HomoRich-G2P-Persian)
+* [SentenceBench Persian G2P Benchmark](https://huggingface.co/datasets/MahtaFetrat/SentenceBench)
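
The front-matter change above moves `datasets` and `language` to the top of the metadata block and adds `library_name` and `pipeline_tag`. As a quick sanity check on the resulting block, the sketch below parses it with a small hand-written helper. The helper name `parse_front_matter` and the `CARD` string are illustrative assumptions, not part of the repository; the parser covers only the flat `key: value` and `- item` subset seen in this diff, and the tag list holds only the entries visible in the hunks (the diff elides the rest).

```python
def parse_front_matter(card_text):
    """Parse the leading '---' delimited metadata block of a model card.

    Hypothetical helper for illustration: supports only flat `key: value`
    scalars and `key:` followed by `- item` list entries, not full YAML.
    """
    lines = card_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return {}

    metadata = {}
    current_list_key = None  # key whose list items are being collected
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if stripped.startswith("- ") and current_list_key is not None:
            metadata[current_list_key].append(stripped[2:].strip())
        elif ":" in stripped:
            key, _, value = stripped.partition(":")
            key, value = key.strip(), value.strip()
            if value:
                metadata[key] = value
                current_list_key = None
            else:
                metadata[key] = []
                current_list_key = key
    return metadata


# Front matter as it appears on the new side of the diff. The tag list is
# limited to the entries shown in the hunks; elided tags are not filled in.
CARD = """---
datasets:
- MahtaFetrat/HomoRich-G2P-Persian
language:
- fa
license: mit
tags:
- g2p
- farsi
- phonemization
- homograph-disambiguation
library_name: transformers
pipeline_tag: text-to-speech
---
# Homo-GE2PE: Persian Grapheme-to-Phoneme Conversion with Homograph Disambiguation
"""

meta = parse_front_matter(CARD)
print(meta["library_name"], meta["pipeline_tag"])
```

A real check would more likely use a full YAML parser; the hand-written version is used here only to keep the example dependency-free.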