Model Card for Marijke/UGARIT_hypopt_reduced_NER

This model is part of a series of models trained for the ML4AL paper “Gotta catch ’em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge, written in the context of the KU Leuven ID-N project NIKAW (Networks of Ideas and Knowledge in the Ancient World).

Model Details

Model Description

  • Developed by: Marijke Beersmans & Alek Keersmaekers
  • Model type: XLMRobertaForTokenClassification, finetuned for NER (PERS, MISC).
  • Language(s) (NLP): Ancient Greek (NFKC normalization)
  • Finetuned from model: UGARIT/grc-alignment
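As a minimal usage sketch, the model can presumably be loaded through the standard `transformers` token-classification pipeline. This assumes the Hub repository ID is `Marijke/UGARIT_hypopt_reduced_NER` and that input should be NFKC-normalized first, as the language entry above suggests. The helper names `normalize_greek`, `load_tagger` and `tag_sentence` are illustrative and not part of the released code.

```python
import unicodedata

# Assumed Hub repository ID for this model.
MODEL_ID = "Marijke/UGARIT_hypopt_reduced_NER"


def normalize_greek(text: str) -> str:
    """Apply NFKC normalization, the normalization listed for this model."""
    return unicodedata.normalize("NFKC", text)


def load_tagger(model_id: str = MODEL_ID):
    """Build a token-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the normalization helper works without transformers installed.
    from transformers import pipeline

    return pipeline("token-classification", model=model_id)


def tag_sentence(text: str, tagger):
    """Normalize the text and return (word, label, score) triples for each subword token."""
    predictions = tagger(normalize_greek(text))
    return [(p["word"], p["entity"], float(p["score"])) for p in predictions]
```

A call such as `tag_sentence("Σωκράτης ἐν Ἀθήναις διελέγετο.", load_tagger())` (an illustrative sentence, not taken from the training data) should return one triple per subword token, labelled with the model's PERS/MISC scheme.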

Model Sources

Training Details

Training Data

Repository: NERAncientGreekML4AL GitHub

We thank the following projects for providing the training data:

Training Hyperparameters

We use Weights & Biases for hyperparameter optimization with a random search strategy (10 folds), aiming to maximize the evaluation F1 score (eval_f1).

The search space includes:

  • Learning Rate: Sampled uniformly between 1e-6 and 1e-4
  • Weight Decay: One of [0.1, 0.01, 0.001]
  • Number of Training Epochs: One of [3, 4, 5, 6]
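The search space above could be written as a Weights & Biases sweep configuration along the following lines. This is a sketch, not the authors' actual configuration: the parameter names (`learning_rate`, `weight_decay`, `num_train_epochs`) are assumed to mirror the Hugging Face `TrainingArguments` fields, and only the ranges, the random strategy and the `eval_f1` target come from the description above.

```python
# Sketch of a W&B sweep configuration matching the search space described above.
sweep_config = {
    "method": "random",  # random search strategy
    "metric": {"name": "eval_f1", "goal": "maximize"},
    "parameters": {
        # Learning rate sampled uniformly between 1e-6 and 1e-4.
        "learning_rate": {"distribution": "uniform", "min": 1e-6, "max": 1e-4},
        # Discrete choices for weight decay and number of epochs.
        "weight_decay": {"values": [0.1, 0.01, 0.001]},
        "num_train_epochs": {"values": [3, 4, 5, 6]},
    },
}
```

A dictionary like this would typically be passed to `wandb.sweep(sweep_config, project=...)`, with the resulting sweep ID handed to `wandb.agent(...)` together with a training function; the project name and training function are left out here because the card does not specify them.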

For the final training of this model, the hyperparameters were:

  • Learning Rate: 2.4904352372072748e-05
  • Weight Decay: 0.001
  • Number of Training Epochs: 6
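To reproduce a comparable training setup, the final values could be passed to the Hugging Face `TrainingArguments` class roughly as follows. This is a hedged sketch: `build_training_args` and the `output_dir` argument are hypothetical, and any settings not listed above (batch size, scheduler, seed) are left at library defaults, which may differ from what the authors used.

```python
# Final hyperparameters as reported in this model card.
FINAL_HYPERPARAMETERS = {
    "learning_rate": 2.4904352372072748e-05,
    "weight_decay": 0.001,
    "num_train_epochs": 6,
}


def build_training_args(output_dir: str):
    """Create TrainingArguments from the reported values (hypothetical helper)."""
    # Imported lazily so the dictionary above is usable without transformers installed.
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **FINAL_HYPERPARAMETERS)
```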

Evaluation

This model was evaluated on precision, recall and macro-F1 for its entity classes. See the paper for more information.

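| Label        | precision | recall | f1-score | support |
|--------------|-----------|--------|----------|---------|
| MISC         | 0.914     | 0.8694 | 0.8912   | 3706    |
| PERS         | 0.8699    | 0.9144 | 0.8916   | 3539    |
| macro avg    | 0.892     | 0.8919 | 0.8914   | 7245    |
| weighted avg | 0.8925    | 0.8914 | 0.8914   | 7245    |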
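The averaged rows follow from the per-class rows using the usual definitions: F1 is the harmonic mean of precision and recall, the macro average is the unweighted mean over classes, and the weighted average uses each class's support as its weight. The sketch below recomputes them from the rounded figures reported above, so results agree with the table only up to rounding. The function names are illustrative.

```python
# Per-class figures as reported in the evaluation table.
ROWS = {
    "MISC": {"precision": 0.914, "recall": 0.8694, "f1": 0.8912, "support": 3706},
    "PERS": {"precision": 0.8699, "recall": 0.9144, "f1": 0.8916, "support": 3539},
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(rows: dict, key: str) -> float:
    """Unweighted mean of a metric over all classes."""
    return sum(row[key] for row in rows.values()) / len(rows)


def weighted_average(rows: dict, key: str) -> float:
    """Mean of a metric over all classes, weighted by class support."""
    total = sum(row["support"] for row in rows.values())
    return sum(row[key] * row["support"] for row in rows.values()) / total


for label, row in ROWS.items():
    print(label, round(f1_score(row["precision"], row["recall"]), 4))
print("macro f1", round(macro_average(ROWS, "f1"), 4))
print("weighted f1", round(weighted_average(ROWS, "f1"), 4))
```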

If you use this work, please cite the following paper:

APA:

Beersmans, M., Keersmaekers, A., de Graaf, E., Van de Cruys, T., Depauw, M., & Fantoli, M. (2024). “Gotta catch `em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge. In J. Pavlopoulos, T. Sommerschield, Y. Assael, S. Gordin, K. Cho, M. Passarotti, R. Sprugnoli, Y. Liu, B. Li, & A. Anderson (Eds.), Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024) (pp. 152–164). Association for Computational Linguistics. https://doi.org/10.18653/v1/2024.ml4al-1.16

BibTeX:

@inproceedings{Beersmans_Keersmaekers_deGraaf_VandeCruys_Depauw_Fantoli_2024,
  address = {Hybrid in Bangkok, Thailand and online},
  title = {“Gotta catch `em all!”: Retrieving people in Ancient Greek texts combining transformer models and domain knowledge},
  url = {https://aclanthology.org/2024.ml4al-1.16},
  DOI = {10.18653/v1/2024.ml4al-1.16},
  abstractNote = {In this paper, we present a study of transformer-based Named Entity Recognition (NER) as applied to Ancient Greek texts, with an emphasis on retrieving personal names. Recent research shows that, while the task remains difficult, the use of transformer models results in significant improvements. We, therefore, compare the performance of four transformer models on the task of NER for the categories of people, locations and groups, and add an out-of-domain test set to the existing datasets. Results on this set highlight the shortcomings of the models when confronted with a random sample of sentences. To be able to more straightforwardly integrate domain and linguistic knowledge to improve performance, we narrow down our approach to the category of people. The task is simplified to a binary PERS/MISC classification on the token level, starting from capitalised words. Next, we test the use of domain and linguistic knowledge to improve the results. We find that including simple gazetteer information as a binary mask has a marginally positive effect on newly annotated data and that treebanks can be used to help identify multi-word individuals if they are scarcely or inconsistently annotated in the available training data. The qualitative error analysis identifies the potential for improvement in both manual annotation and the inclusion of domain and linguistic knowledge in the transformer models.},
  booktitle = {Proceedings of the 1st Workshop on Machine Learning for Ancient Languages (ML4AL 2024)},
  publisher = {Association for Computational Linguistics},
  author = {Beersmans, Marijke and Keersmaekers, Alek and de Graaf, Evelien and Van de Cruys, Tim and Depauw, Mark and Fantoli, Margherita},
  editor = {Pavlopoulos, John and Sommerschield, Thea and Assael, Yannis and Gordin, Shai and Cho, Kyunghyun and Passarotti, Marco and Sprugnoli, Rachele and Liu, Yudong and Li, Bin and Anderson, Adam},
  year = {2024},
  month = aug,
  pages = {152--164}
}