nielsr HF Staff committed on
Commit 25f324a (verified) · 1 Parent(s): 752d149

Update pipeline tag and add library name

This PR improves the model card for `ufal/byt5-small-geccc-mate` by making two key updates to its metadata:

1. **Corrected `pipeline_tag`**: The model performs grammatical error correction (GEC), which is a text-to-text generation task. The `pipeline_tag` is now set to `text-generation` to reflect this, so users can find the model when browsing text generation models on the Hugging Face Hub: https://huggingface.co/models?pipeline_tag=text-generation

2. **Added `library_name`**: The model is compatible with the `transformers` library, as demonstrated by the provided usage example in the model card. Adding `library_name: transformers` will enable the automatic "How to use" widget on the model page, providing users with an immediate, executable code snippet for easy adoption.

No changes were made to the content of the model card, as it is already well documented and includes a link to the paper and a sample usage example.
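
As a rough illustration of the metadata described above, the sketch below reads the YAML front matter of a model card and checks the two new keys. The `parse_front_matter` helper and the `README_HEAD` string are hypothetical and written only for this example; the parser handles flat `key: value` pairs and simple `- item` lists, not full YAML, and it is not how the Hub itself reads model cards.

```python
def parse_front_matter(readme_text):
    """Parse a flat YAML front-matter block (scalars and simple '- item' lists only).

    Hypothetical helper for illustration; a real project would use a YAML library.
    """
    lines = readme_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}  # no front matter at the top of the file

    metadata = {}
    current_key = None
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter ends the metadata block
            break
        if not stripped:
            continue
        if stripped.startswith("- "):
            # List item: attach it to the most recent key that opened a list.
            if isinstance(metadata.get(current_key), list):
                metadata[current_key].append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            continue  # not a key/value line; skipped in this simplified parser
        current_key = key.strip()
        value = value.strip()
        # An empty value (e.g. "tags:") opens a list for the following items.
        metadata[current_key] = value if value else []
    return metadata


# Front matter as it appears on the new side of the diff in this commit.
README_HEAD = """---
base_model: google/byt5-small
language: cs
license: cc-by-nc-sa-4.0
tags:
- Czech
- GEC
- GECCC dataset
pipeline_tag: text-generation
library_name: transformers
---

# Model Card for byt5-small-geccc-mate
"""

metadata = parse_front_matter(README_HEAD)
print(metadata["pipeline_tag"], metadata["library_name"])
```

A check like this could run before opening a metadata PR to confirm that the keys are present and spelled as intended.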

Files changed (1)
  1. README.md +17 -15
README.md CHANGED
@@ -1,11 +1,13 @@
 ---
+base_model: google/byt5-small
 language: cs
 license: cc-by-nc-sa-4.0
 tags:
 - Czech
 - GEC
 - GECCC dataset
-base_model: google/byt5-small
+pipeline_tag: text-generation
+library_name: transformers
 ---
 
 # Model Card for byt5-small-geccc-mate
@@ -18,20 +20,20 @@ the MATE method and the [GECCC dataset](https://hdl.handle.net/11234/1-4861).
 
 ## Model Description
 
-- **Developed by:** [Seznam.cz](https://seznam.cz) and [Charles University, MFF, ÚFAL](https://ufal.mff.cuni.cz/)
-- **Language(s) (NLP):** Czech
-- **Model type:** character-based encoder-decoder Transformer model
-- **Finetuned from model:** `google/byt5-small`
-- **Finetuned on:**
-- first synthetic errors generated by the MATE method (see [the paper](https://arxiv.org/abs/2506.22402))
-- then the [GECCC dataset](https://hdl.handle.net/11234/1-4861)
-- **License:** CC BY-NC-SA 4.0
+- **Developed by:** [Seznam.cz](https://seznam.cz) and [Charles University, MFF, ÚFAL](https://ufal.mff.cuni.cz/)
+- **Language(s) (NLP):** Czech
+- **Model type:** character-based encoder-decoder Transformer model
+- **Finetuned from model:** `google/byt5-small`
+- **Finetuned on:**
+- first synthetic errors generated by the MATE method (see [the paper](https://arxiv.org/abs/2506.22402))
+- then the [GECCC dataset](https://hdl.handle.net/11234/1-4861)
+- **License:** CC BY-NC-SA 4.0
 
 ## Model Sources
 
-- **Repository:** https://github.com/ufal/tsd2025-gec
-- **Paper:** [Refining Czech GEC: Insights from a Multi-Experiment Approach](https://arxiv.org/abs/2506.22402)
-- **Dataset:** [GECCC dataset](https://hdl.handle.net/11234/1-4861)
+- **Repository:** https://github.com/ufal/tsd2025-gec
+- **Paper:** [Refining Czech GEC: Insights from a Multi-Experiment Approach](https://arxiv.org/abs/2506.22402)
+- **Dataset:** [GECCC dataset](https://hdl.handle.net/11234/1-4861)
 
 ## Evaluation
 
@@ -69,8 +71,8 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
 
 ```
 @InProceedings{10.1007/978-3-032-02551-7_7,
-author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and N{\'a}plava, Jakub",
-editor="Ek{\v{s}}tein, Kamil and Konop{\'i}k, Miloslav and Pra{\v{z}}{\'a}k, Ond{\v{r}}ej and P{\'a}rtl, Franti{\v{s}}ek",
+author="Pechman, Petr and Straka, Milan and Strakov{\'a}, Jana and Náplava, Jakub",
+editor="Ek{\v{s}}tein, Kamil and Konopík, Miloslav and Pražák, Ondřej and Pártl, František",
 title="Refining Czech GEC: Insights from a Multi-experiment Approach",
 booktitle="Text, Speech, and Dialogue",
 year="2026",
@@ -80,4 +82,4 @@ print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
 isbn="978-3-032-02551-7",
 doi="10.1007/978-3-032-02551-7_7"
 }
-```
+```