Update pipeline tag and add library name
This PR improves the model card for `ufal/byt5-small-geccc-mate` by making two key updates to its metadata:
1. **Corrected `pipeline_tag`**: The model performs Grammar Error Correction (GEC), which is a text-to-text generation task. The `pipeline_tag` has been updated to `text-generation` to accurately reflect its functionality. This will ensure users can find this model when browsing text generation models on the Hugging Face Hub: https://huggingface.co/models?pipeline_tag=text-generation.
2. **Added `library_name`**: The model is compatible with the `transformers` library, as demonstrated by the provided usage example in the model card. Adding `library_name: transformers` will enable the automatic "How to use" widget on the model page, providing users with an immediate, executable code snippet for easy adoption.
No changes were made to the content of the model card itself, as it is already well documented and includes a link to the paper as well as a sample usage snippet.
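For quick reference, the complete front matter of the model card after these metadata updates (as captured in the diff) reads:

```yaml
---
base_model: google/byt5-small
language: cs
license: cc-by-nc-sa-4.0
tags:
- Czech
- GEC
- GECCC dataset
pipeline_tag: text-generation
library_name: transformers
---
```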
````diff
@@ -1,11 +1,13 @@
 ---
+base_model: google/byt5-small
 language: cs
 license: cc-by-nc-sa-4.0
 tags:
 - Czech
 - GEC
 - GECCC dataset
-
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # Model Card for byt5-small-geccc-mate
@@ -18,20 +20,20 @@ the MATE method and the [GECCC dataset](https://hdl.handle.net/11234/1-4861).
 
 ## Model Description
 
--
--
--
--
--
-
-
--
+- **Developed by:** [Seznam.cz](https://seznam.cz) and [Charles University, MFF, ÚFAL](https://ufal.mff.cuni.cz/)
+- **Language(s) (NLP):** Czech
+- **Model type:** character-based encoder-decoder Transformer model
+- **Finetuned from model:** `google/byt5-small`
+- **Finetuned on:**
+  - first synthetic errors generated by the MATE method (see [the paper](https://arxiv.org/abs/2506.22402))
+  - then the [GECCC dataset](https://hdl.handle.net/11234/1-4861)
+- **License:** CC BY-NC-SA 4.0
 
 ## Model Sources
 
--
--
--
+- **Repository:** https://github.com/ufal/tsd2025-gec
+- **Paper:** [Refining Czech GEC: Insights from a Multi-Experiment Approach](https://arxiv.org/abs/2506.22402)
+- **Dataset:** [GECCC dataset](https://hdl.handle.net/11234/1-4861)
 
 ## Evaluation
 
@@ -69,8 +71,8 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
 
 ```
 @InProceedings{10.1007/978-3-032-02551-7_7,
-author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and
-editor="Ek{\v{s}}tein, Kamil and
+author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and Náplava, Jakub",
+editor="Ek{\v{s}}tein, Kamil and Konopík, Miloslav and Pražák, Ondřej and Pártl, František",
 title="Refining Czech GEC: Insights from a Multi-experiment Approach",
 booktitle="Text, Speech, and Dialogue",
 year="2026",
@@ -80,4 +82,4 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
 isbn="978-3-032-02551-7",
 doi="10.1007/978-3-032-02551-7_7"
 }
-```
+```
````