Model Card for Marijke/UGARIT_hypopt_NER

This model is part of a series of models trained for the ML4AL paper "“Gotta catch ‘em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge", written in the context of the KU Leuven ID-N project NIKAW (Networks of Ideas and Knowledge in the Ancient World).

Model Details

Model Description

Developed by: Marijke Beersmans & Alek Keersmaekers
Model type: XLMRobertaForTokenClassification, finetuned for NER (PERS, LOC, GRP).
Language(s) (NLP): Ancient Greek (NFKC normalization)
Finetuned from model: UGARIT/grc-alignment
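
The card lists NFKC normalization for the model's input language, so text should probably be normalized the same way before tagging. The following is a minimal sketch of that step, not the authors' own inference code. It assumes the model is published under the repository name `Marijke/UGARIT_hypopt_NER`; the helper names `normalize_greek` and `tag_entities` are hypothetical.

```python
import unicodedata

# Assumed Hugging Face repository name for this model (an assumption, not stated in the card text).
MODEL_ID = "Marijke/UGARIT_hypopt_NER"


def normalize_greek(text: str) -> str:
    """Apply NFKC normalization, the form the card lists for the model's language."""
    return unicodedata.normalize("NFKC", text)


def tag_entities(text: str, model_id: str = MODEL_ID):
    """Run token classification on NFKC-normalized text and return grouped entity spans."""
    # Imported lazily so the normalization helper works without transformers installed.
    from transformers import pipeline

    tagger = pipeline(
        "token-classification",
        model=model_id,
        aggregation_strategy="simple",  # merge subword pieces into entity spans
    )
    return tagger(normalize_greek(text))
```

Calling `tag_entities("Ἡροδότου Ἁλικαρνησσέος ἱστορίης ἀπόδεξις ἥδε")` downloads the model on first use and returns a list of dictionaries with `entity_group`, `word`, `score`, `start` and `end` keys. The entity groups are expected to correspond to PERS, LOC and GRP, although the exact label strings depend on the model's configuration.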

Model Sources

Repository: NERAncientGreekML4AL GitHub (for data and training scripts)
Paper: ML4AL paper (https://aclanthology.org/2024.ml4al-1.16)

Training Details

Training Data

Repository: NERAncientGreekML4AL GitHub

We thank the following projects for providing the training data:

Training Hyperparameters

We use Weights & Biases for hyperparameter optimization with a random search strategy (10 folds), aiming to maximize the evaluation F1 score (eval_f1).

The search space includes:

Learning Rate: Sampled uniformly between 1e-6 and 1e-4
Weight Decay: One of [0.1, 0.01, 0.001]
Number of Training Epochs: One of [3, 4, 5, 6]
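
The search space above could be expressed as a Weights & Biases sweep configuration roughly as follows. This is a sketch under assumptions, not the authors' actual configuration: the parameter key names (`learning_rate`, `weight_decay`, `num_train_epochs`) are borrowed from the Hugging Face `TrainingArguments` naming, and `launch_sweep`, `train_fn`, `project` and `runs` are hypothetical.

```python
# Hypothetical sweep definition; the key names under "parameters" are assumptions.
SWEEP_CONFIG = {
    "method": "random",  # random search strategy
    "metric": {"name": "eval_f1", "goal": "maximize"},
    "parameters": {
        "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
        "weight_decay": {"values": [0.1, 0.01, 0.001]},
        "num_train_epochs": {"values": [3, 4, 5, 6]},
    },
}


def launch_sweep(train_fn, project: str, runs: int):
    """Register the sweep and run `runs` trials of the user-supplied `train_fn`."""
    import wandb  # imported lazily; requires a configured W&B account

    sweep_id = wandb.sweep(SWEEP_CONFIG, project=project)
    wandb.agent(sweep_id, function=train_fn, project=project, count=runs)
    return sweep_id
```

Inside `train_fn`, each trial would read its sampled values from `wandb.config` after calling `wandb.init()`.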

For the final training of this model, the hyperparameters were:

Learning Rate: 5.784084017961986e-05
Weight Decay: 0.01
Number of Training Epochs: 5
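
For readers who want to reuse these values, one possible way to pass them to the Hugging Face `Trainer` API is sketched below. It assumes the training scripts use `TrainingArguments`, which the card does not confirm; `build_training_args` and `output_dir` are hypothetical, and all other arguments are left at library defaults.

```python
# Final hyperparameters as reported in the card.
FINAL_HPARAMS = {
    "learning_rate": 5.784084017961986e-05,
    "weight_decay": 0.01,
    "num_train_epochs": 5,
}


def build_training_args(output_dir: str):
    """Build TrainingArguments from the reported values (everything else is default)."""
    # Imported lazily so the dictionary can be inspected without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **FINAL_HPARAMS)
```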

Evaluation

This model was evaluated on precision, recall and macro-F1 for its entity classes. See the paper for more information.

Label          Precision   Recall   F1-score   Support
GRP            0.7838      0.8801   0.8291     1384
LOC            0.7073      0.7502   0.7282     1105
PERS           0.8228      0.9107   0.8645     3090
micro avg      0.7909      0.8713   0.8292     5579
macro avg      0.7713      0.8470   0.8073     5579
weighted avg   0.7903      0.8713   0.8287     5579
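
To make the relationship between the per-class rows and the averaged rows explicit, the short sketch below recomputes the macro and weighted averages from the rounded per-class values in the table. The function names are hypothetical. Because the inputs are already rounded to four decimals, the results agree with the reported rows only to about three decimal places. The micro average is not recomputed, since it needs the raw true-positive and prediction counts, which the card does not give.

```python
# Per-class rows copied from the table: (precision, recall, f1, support).
PER_CLASS = {
    "GRP": (0.7838, 0.8801, 0.8291, 1384),
    "LOC": (0.7073, 0.7502, 0.7282, 1105),
    "PERS": (0.8228, 0.9107, 0.8645, 3090),
}


def macro_average(rows):
    """Unweighted mean of each metric across classes."""
    n = len(rows)
    return tuple(sum(r[i] for r in rows.values()) / n for i in range(3))


def weighted_average(rows):
    """Mean of each metric across classes, weighted by class support."""
    total = sum(r[3] for r in rows.values())
    return tuple(sum(r[i] * r[3] for r in rows.values()) / total for i in range(3))


macro = macro_average(PER_CLASS)
weighted = weighted_average(PER_CLASS)
print("macro avg:   ", [round(v, 4) for v in macro])
print("weighted avg:", [round(v, 4) for v in weighted])
```

The macro average treats the three classes equally, so the weaker LOC scores pull it below the weighted average, which is dominated by the larger PERS class.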

If you use this work, please cite the following paper:

APA:

Beersmans, M., Keersmaekers, A., de Graaf, E., Van de Cruys, T., Depauw, M., & Fantoli, M. (2024). “Gotta catch ‘em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge. In J. Pavlopoulos, T. Sommerschield, Y. Assael, S. Gordin, K. Cho, M. Passarotti, R. Sprugnoli, Y. Liu, B. Li, & A. Anderson (Eds.), Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024) (pp. 152–164). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.ml4al-1.16

BibTeX

@inproceedings{Beersmans_Keersmaekers_deGraaf_VandeCruys_Depauw_Fantoli_2024,
  address = {Hybrid in Bangkok, Thailand and online},
  title = {“Gotta catch `em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge},
  url = {https://aclanthology.org/2024.ml4al-1.16},
  DOI = {10.18653/v1/2024.ml4al-1.16},
  abstractNote = {In this paper, we present a study of transformer-based Named Entity Recognition (NER) as applied to Ancient Greek texts, with an emphasis on retrieving personal names. Recent research shows that, while the task remains difficult, the use of transformer models results in significant improvements. We, therefore, compare the performance of four transformer models on the task of NER for the categories of people, locations and groups, and add an out-of-domain test set to the existing datasets. Results on this set highlight the shortcomings of the models when confronted with a random sample of sentences. To be able to more straightforwardly integrate domain and linguistic knowledge to improve performance, we narrow down our approach to the category of people. The task is simplified to a binary PERS/MISC classification on the token level, starting from capitalised words. Next, we test the use of domain and linguistic knowledge to improve the results. We find that including simple gazetteer information as a binary mask has a marginally positive effect on newly annotated data and that treebanks can be used to help identify multi-word individuals if they are scarcely or inconsistently annotated in the available training data. The qualitative error analysis identifies the potential for improvement in both manual annotation and the inclusion of domain and linguistic knowledge in the transformer models.},
  booktitle = {Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)},
  publisher = {Association for Computational Linguistics},
  author = {Beersmans, Marijke and Keersmaekers, Alek and de Graaf, Evelien and Van de Cruys, Tim and Depauw, Mark and Fantoli, Margherita},
  editor = {Pavlopoulos, John and Sommerschield, Thea and Assael, Yannis and Gordin, Shai and Cho, Kyunghyun and Passarotti, Marco and Sprugnoli, Rachele and Liu, Yudong and Li, Bin and Anderson, Adam},
  year = {2024},
  month = aug,
  pages = {152--164}
}
