Model Card for Marijke/UGARIT_hypopt_NER
This model is part of a series of models trained for the ML4AL paper “Gotta catch ’em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge, written in the context of the KU Leuven ID-N project NIKAW (Networks of Ideas and Knowledge in the Ancient World).
Model Details
Model Description
- Developed by: Marijke Beersmans & Alek Keersmaekers
- Model type: XLMRobertaForTokenClassification, fine-tuned for named entity recognition (NER) with the labels PERS (people), LOC (locations) and GRP (groups).
- Language(s) (NLP): Ancient Greek (NFKC normalization)
- Finetuned from model: UGARIT/grc-alignment
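The language entry above mentions NFKC normalization, which presumably means that input text should be in Unicode NFKC form before it is passed to the model. A minimal sketch using only the Python standard library follows; the helper name `normalize_greek` is an illustrative assumption and is not taken from the training scripts.

```python
import unicodedata


def normalize_greek(text: str) -> str:
    """Return the Unicode NFKC-normalized form of `text`."""
    return unicodedata.normalize("NFKC", text)


# The same accented alpha can be encoded in several ways:
decomposed = "\u03b1\u0301"  # alpha followed by a combining acute accent
oxia = "\u1f71"              # precomposed "alpha with oxia" (Greek Extended block)
tonos = "\u03ac"             # precomposed "alpha with tonos" (Greek block)

# As raw strings, the three encodings are all different.
print(decomposed == tonos, oxia == tonos)  # False False

# After NFKC normalization they collapse to a single code point.
print(normalize_greek(decomposed) == tonos, normalize_greek(oxia) == tonos)  # True True
```

Normalizing input the same way as the training data avoids visually identical words being tokenized differently.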
Model Sources
- Repository: NERAncientGreekML4AL GitHub (for data and training scripts)
- Paper: ML4AL paper (https://aclanthology.org/2024.ml4al-1.16)
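A minimal usage sketch with the Hugging Face `transformers` token-classification pipeline is given here for orientation. It assumes the model is published under the identifier `Marijke/UGARIT_hypopt_NER` and that `transformers` and a backend such as PyTorch are installed. The helper names are illustrative, and the exact label strings returned depend on the model configuration.

```python
import unicodedata

# Assumption: the Hub identifier under which this model is published.
MODEL_ID = "Marijke/UGARIT_hypopt_NER"


def build_ner_pipeline(model_id: str = MODEL_ID):
    """Create a token-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helper below works without transformers installed.
    from transformers import pipeline

    # aggregation_strategy="simple" merges sub-word pieces into entity spans.
    return pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",
    )


def extract_entities(text, ner):
    """Run `ner` on `text` and return (surface form, label) pairs."""
    return [(ent["word"], ent["entity_group"]) for ent in ner(text)]


def demo():
    """Tag the opening words of Herodotus' Histories (requires network access)."""
    ner = build_ner_pipeline()
    sentence = "Ἡροδότου Ἁλικαρνησσέος ἱστορίης ἀπόδεξις ἥδε"
    # Normalize to NFKC first, in line with the language note in this card.
    print(extract_entities(unicodedata.normalize("NFKC", sentence), ner))
```

Calling `demo()` downloads the model and prints the detected spans; no particular output is shown here because it has not been verified against the released weights.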
Training Details
Training Data
Repository: NERAncientGreekML4AL GitHub
We thank the following projects for providing the training data:
- Digital Periegesis
- Josh Kemp, annotated Odyssey
- The Stepbible project
- Perseus Digital Library, Deipnosophistae
Training Hyperparameters
We use Weights & Biases for hyperparameter optimization with a random search strategy (10 folds), aiming to maximize the evaluation F1 score (eval_f1).
The search space includes:
- Learning Rate: Sampled uniformly between 1e-6 and 1e-4
- Weight Decay: One of [0.1, 0.01, 0.001]
- Number of Training Epochs: One of [3, 4, 5, 6]
For the final training of this model, the hyperparameters were:
- Learning Rate: 5.784084017961986e-05
- Weight Decay: 0.01
- Number of Training Epochs: 5
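The search space and final values above could be expressed as a Weights & Biases sweep configuration roughly as follows. This is a sketch, not the configuration from the training scripts: the parameter names `learning_rate`, `weight_decay` and `num_train_epochs` are assumptions chosen to match the usual Hugging Face `TrainingArguments` names, and the helper `in_search_space` is purely illustrative.

```python
# Sweep configuration in the dictionary format accepted by wandb.sweep().
sweep_config = {
    "method": "random",  # random search strategy
    "metric": {"name": "eval_f1", "goal": "maximize"},
    "parameters": {
        # Assumed parameter names; the training scripts may use others.
        "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
        "weight_decay": {"values": [0.1, 0.01, 0.001]},
        "num_train_epochs": {"values": [3, 4, 5, 6]},
    },
}

# Final hyperparameters reported in this model card.
final_params = {
    "learning_rate": 5.784084017961986e-05,
    "weight_decay": 0.01,
    "num_train_epochs": 5,
}


def in_search_space(params, config):
    """Check that every value in `params` lies inside the sweep's search space."""
    for name, spec in config["parameters"].items():
        value = params[name]
        if "values" in spec:
            # Categorical parameter: must be one of the listed choices.
            if value not in spec["values"]:
                return False
        elif not (spec["min"] <= value <= spec["max"]):
            # Continuous parameter: must fall within [min, max].
            return False
    return True


print(in_search_space(final_params, sweep_config))  # True
```

With `wandb` installed, a dictionary like this would typically be passed to `wandb.sweep(...)` and the returned sweep ID to `wandb.agent(...)` together with a training function.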
Evaluation
This model was evaluated on precision, recall and F1 score for each entity class, along with micro, macro and weighted averages. See the paper for more information.
| Label | Precision | Recall | F1-score | Support |
|---|---|---|---|---|
| GRP | 0.7838 | 0.8801 | 0.8291 | 1384 |
| LOC | 0.7073 | 0.7502 | 0.7282 | 1105 |
| PERS | 0.8228 | 0.9107 | 0.8645 | 3090 |
| micro avg | 0.7909 | 0.8713 | 0.8292 | 5579 |
| macro avg | 0.7713 | 0.8470 | 0.8073 | 5579 |
| weighted avg | 0.7903 | 0.8713 | 0.8287 | 5579 |
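To make the relation between the per-class rows and the averaged rows explicit, the sketch below recomputes the macro average (unweighted mean over the three classes) and the weighted average (mean weighted by support) from the rounded per-class figures in the table. Because the inputs are already rounded to four decimals, the results can differ from the reported averages in the last digit.

```python
# Per-class scores from the table: (precision, recall, f1, support).
rows = {
    "GRP": (0.7838, 0.8801, 0.8291, 1384),
    "LOC": (0.7073, 0.7502, 0.7282, 1105),
    "PERS": (0.8228, 0.9107, 0.8645, 3090),
}


def macro_avg(rows, idx):
    """Unweighted mean of one metric column over all classes."""
    return sum(r[idx] for r in rows.values()) / len(rows)


def weighted_avg(rows, idx):
    """Mean of one metric column, weighted by each class's support."""
    total = sum(r[3] for r in rows.values())
    return sum(r[idx] * r[3] for r in rows.values()) / total


for name, idx in (("precision", 0), ("recall", 1), ("f1", 2)):
    print(f"{name}: macro={macro_avg(rows, idx):.4f} "
          f"weighted={weighted_avg(rows, idx):.4f}")
```

The macro F1 is lower than the weighted F1 because LOC, the weakest class, counts as much as PERS in the macro average although it has roughly a third of the support.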
If you use this work, please cite the following paper:
APA:
Beersmans, M., Keersmaekers, A., de Graaf, E., Van de Cruys, T., Depauw, M., & Fantoli, M. (2024). “Gotta catch ’em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge. In J. Pavlopoulos, T. Sommerschield, Y. Assael, S. Gordin, K. Cho, M. Passarotti, R. Sprugnoli, Y. Liu, B. Li, & A. Anderson (Eds.), Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024) (pp. 152–164). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.ml4al-1.16
BibTeX
@inproceedings{Beersmans_Keersmaekers_de_Graaf_Van_de_Cruys_Depauw_Fantoli_2024,
address = {Hybrid in Bangkok, Thailand and online},
title = {“Gotta catch `em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge},
url = {https://aclanthology.org/2024.ml4al-1.16},
DOI = {10.18653/v1/2024.ml4al-1.16},
abstractNote = {In this paper, we present a study of transformer-based Named Entity Recognition (NER) as applied to Ancient Greek texts, with an emphasis on retrieving personal names. Recent research shows that, while the task remains difficult, the use of transformer models results in significant improvements. We, therefore, compare the performance of four transformer models on the task of NER for the categories of people, locations and groups, and add an out-of-domain test set to the existing datasets. Results on this set highlight the shortcomings of the models when confronted with a random sample of sentences. To be able to more straightforwardly integrate domain and linguistic knowledge to improve performance, we narrow down our approach to the category of people. The task is simplified to a binary PERS/MISC classification on the token level, starting from capitalised words. Next, we test the use of domain and linguistic knowledge to improve the results. We find that including simple gazetteer information as a binary mask has a marginally positive effect on newly annotated data and that treebanks can be used to help identify multi-word individuals if they are scarcely or inconsistently annotated in the available training data. The qualitative error analysis identifies the potential for improvement in both manual annotation and the inclusion of domain and linguistic knowledge in the transformer models.},
booktitle = {Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)},
publisher = {Association for Computational Linguistics},
author = {Beersmans, Marijke and Keersmaekers, Alek and de Graaf, Evelien and Van de Cruys, Tim and Depauw, Mark and Fantoli, Margherita},
editor = {Pavlopoulos, John and Sommerschield, Thea and Assael, Yannis and Gordin, Shai and Cho, Kyunghyun and Passarotti, Marco and Sprugnoli, Rachele and Liu, Yudong and Li, Bin and Anderson, Adam},
year = {2024},
month = aug,
pages = {152--164}
}