---
base_model:
- CohereLabs/aya-101
library_name: peft
license: cc-by-sa-4.0
language:
- fi
metrics:
- bleu
- bertscore
tags:
- text2text-generation
- definition-modeling
---

# ltg/aya-definition-fi-axolotl24st_dbnary

This model is a version of [CohereLabs/aya-101](https://huggingface.co/CohereLabs/aya-101), fine-tuned with PEFT on datasets of Finnish usage examples and definitions.

It generates definitions of Finnish words in context. The input is a usage example followed by the instruction question "Mitä tarkoittaa \<target word\>?" ("What does \<target word\> mean?").
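
For illustration, here is a minimal sketch of how such an input can be assembled; the Finnish sentence is invented, and the exact formatting used during fine-tuning is defined in the training script linked below.

```python
# Hypothetical usage example; the target word is "kone" ("machine").
usage_example = "Hän osti uuden koneen lentääkseen Helsinkiin."
target_word = "kone"

# The model input is the usage example followed by the instruction question.
prompt = f"{usage_example} Mitä tarkoittaa {target_word}?"
print(prompt)
```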

## Other models

### Finnish

- decoder-only
  - [Tower, axolotl24](https://huggingface.co/ltg/tower-definition-fi-axolotl24st)
  - [Tower, axolotl24 + dbnary](https://huggingface.co/ltg/tower-definition-fi-axolotl24st_dbnary)
- encoder-decoder
  - [mT0-xl, axolotl24](https://huggingface.co/ltg/mt0-definition-fi-xl-axolotl24st)
  - [mT0-xl, axolotl24 + dbnary](https://huggingface.co/ltg/mt0-definition-fi-xl-axolotl24st_dbnary)
  - [aya-101, axolotl24](https://huggingface.co/ltg/aya-definition-fi-axolotl24st)
  - [aya-101, axolotl24 + dbnary](https://huggingface.co/ltg/aya-definition-fi-axolotl24st_dbnary)

### German

- decoder-only
  - [Tower, dbnary](https://huggingface.co/ltg/tower-definition-de-dbnary)
- encoder-decoder
  - [mT0-xl, dbnary](https://huggingface.co/ltg/mt0-definition-de-xl-dbnary)
  - [aya-101, dbnary](https://huggingface.co/ltg/aya-definition-de-dbnary)

### Russian

- decoder-only
  - [Tower, axolotl24](https://huggingface.co/ltg/tower-definition-ru-axolotl24st)
  - [Tower, axolotl24 + dbnary](https://huggingface.co/ltg/tower-definition-ru-axolotl24st_dbnary)
- encoder-decoder
  - [mT0-xl, axolotl24](https://huggingface.co/ltg/mt0-definition-ru-xl-axolotl24st)
  - [mT0-xl, axolotl24 + dbnary](https://huggingface.co/ltg/mt0-definition-ru-xl-axolotl24st_dbnary)
  - [aya-101, axolotl24](https://huggingface.co/ltg/aya-definition-ru-axolotl24st)
  - [aya-101, axolotl24 + dbnary](https://huggingface.co/ltg/aya-definition-ru-axolotl24st_dbnary)

## Model Sources

- **Repository:** [MultilingualDefGen](https://github.com/ltgoslo/MultilingualDefGen)
- **Paper:** [Explaining novel senses using definition generation with open language models](https://arxiv.org/abs/2509.26181) (accepted to EMNLP 2025 Findings)

## Uses

The model is intended for research purposes, as a source of contextualized dictionary-like lexical definitions.

The fine-tuning datasets were limited to Finnish. Although the original model is multilingual, we did not evaluate its ability to generate definitions in other languages.

Generated definitions may contain biases and stereotypes inherited from the underlying language model and from the raw dictionary data.

### Direct Use

[script to run prediction](https://github.com/ltgoslo/MultilingualDefGen/blob/main/src/modeling/encoder_decoder_predict_lumi.py)
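
Alternatively, a minimal inference sketch with `transformers` and `peft` along the following lines should work; the usage example, data types, and generation settings here are illustrative, and the linked script remains the reference implementation.

```python
import torch
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

base_id = "CohereLabs/aya-101"
adapter_id = "ltg/aya-definition-fi-axolotl24st_dbnary"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForSeq2SeqLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"  # device_map needs `accelerate`
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Hypothetical usage example with the instruction question appended.
prompt = "Hän osti uuden koneen lentääkseen Helsinkiin. Mitä tarkoittaa kone?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=60, num_beams=4)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```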

## Training Details

### Training Data

[axolotl24](https://github.com/ltgoslo/axolotl24_shared_task/tree/main/data/finnish)

[dbnary](https://kaiko.getalp.org/about-dbnary/)

### Training Procedure

[script to run training](https://github.com/ltgoslo/MultilingualDefGen/blob/main/src/modeling/encoder_decoder_finetuning.py)
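
The linked script is authoritative for the actual procedure and hyperparameters; the following is only a rough, hypothetical outline of PEFT fine-tuning of a seq2seq model on (prompt, definition) pairs, with invented LoRA settings and a toy two-example dataset.

```python
from datasets import Dataset
from peft import LoraConfig, TaskType, get_peft_model
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

base_id = "CohereLabs/aya-101"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForSeq2SeqLM.from_pretrained(base_id)

# Hypothetical LoRA settings; the configuration actually used lives in the linked script.
lora_config = LoraConfig(task_type=TaskType.SEQ_2_SEQ_LM, r=16, lora_alpha=32, lora_dropout=0.05)
model = get_peft_model(model, lora_config)

def preprocess(batch):
    # Input: usage example + instruction question; target: the gold definition.
    features = tokenizer(batch["prompt"], truncation=True, max_length=256)
    features["labels"] = tokenizer(text_target=batch["definition"], truncation=True, max_length=128)["input_ids"]
    return features

# Toy examples standing in for the axolotl24 / dbnary training pairs.
train_data = Dataset.from_dict({
    "prompt": [
        "Hän osti uuden koneen lentääkseen Helsinkiin. Mitä tarkoittaa kone?",
        "Kissa nukkui ikkunalaudalla. Mitä tarkoittaa kissa?",
    ],
    "definition": [
        "lentokone",
        "pieni kotieläin, joka naukuu",
    ],
}).map(preprocess, batched=True, remove_columns=["prompt", "definition"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="aya-definition-fi", per_device_train_batch_size=1),
    train_dataset=train_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
model.save_pretrained("aya-definition-fi-adapter")  # saves only the adapter weights
```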

## Evaluation

[script to run evaluation](https://github.com/ltgoslo/MultilingualDefGen/blob/main/src/evaluate.sh)

#### Testing Data

[axolotl24 Finnish test set](https://github.com/ltgoslo/axolotl24_shared_task/blob/main/data/finnish/axolotl.test.fi.gold.tsv)

#### Metrics

BLEU, BERTScore
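
The linked script defines the exact evaluation protocol; as a minimal sketch, the two metrics can be computed roughly as follows (`sacrebleu` is an assumption here and not in the framework list below; `bert-score` is listed).

```python
import sacrebleu  # assumed BLEU implementation; the evaluation script is authoritative
from bert_score import score as bert_score

# Toy generated and gold definitions standing in for real system output.
hyps = ["laite, joka tekee työtä"]
refs = ["ihmisen rakentama laite"]

bleu = sacrebleu.corpus_bleu(hyps, [refs])
print(f"BLEU: {bleu.score:.2f}")

# lang="fi" makes bert-score fall back to a multilingual backbone.
precision, recall, f1 = bert_score(hyps, refs, lang="fi")
print(f"BERTScore F1: {f1.mean().item():.4f}")
```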

## Citation

**BibTeX:**

```
@misc{fedorova2025explainingnovelsensesusing,
      title={Explaining novel senses using definition generation with open language models},
      author={Mariia Fedorova and Andrey Kutuzov and Francesco Periti and Yves Scherrer},
      year={2025},
      eprint={2509.26181},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2509.26181},
}
```

### Framework versions

```
bert-score==0.3.13
peft==0.14.0
sentencepiece==0.2.0
tokenizers==0.20.1
torch==2.2.2
transformers==4.46.1
trl==0.15.2
```