---
library_name: transformers
tags: []
---
# Tetun BERT model
A fine-tune of [xlm-roberta-large](https://huggingface.co/FacebookAI/xlm-roberta-large) trained on Tetun data with a masked language modelling objective.
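
For readers unfamiliar with the objective, here is a minimal sketch of BERT-style token masking. It is an illustration, not the training code used for this model: the 15% masking rate and the 80/10/10 split are the usual defaults and are assumed here, whitespace-separated words stand in for the subword tokens a real tokenizer would produce, and `mask_tokens` and the tiny `vocab` list are hypothetical names.

```python
import random

MASK_TOKEN = "<mask>"  # mask token used by XLM-RoBERTa tokenizers


def mask_tokens(tokens, vocab, mask_prob=0.15, seed=None):
    """Return (inputs, labels) for one masked language modelling example.

    labels[i] holds the original token where a prediction is required,
    and None where the position is ignored by the loss.
    """
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            labels.append(tok)  # the model must recover this token
            roll = rng.random()
            if roll < 0.8:
                inputs.append(MASK_TOKEN)          # 80%: replace with mask
            elif roll < 0.9:
                inputs.append(rng.choice(vocab))   # 10%: random token
            else:
                inputs.append(tok)                 # 10%: keep unchanged
        else:
            inputs.append(tok)
            labels.append(None)  # not selected, no loss at this position
    return inputs, labels


# Assumed example sentence and vocabulary, for illustration only.
tokens = ["Ha'u", "hela", "iha", "Dili"]
vocab = ["Ha'u", "hela", "iha", "Dili", "Timor"]
inputs, labels = mask_tokens(tokens, vocab, seed=0)
```

In practice the `transformers` data collator for language modelling performs this step on token IDs at batch time.
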
Training data: the `tet` clean split of [MADLAD-400](https://huggingface.co/datasets/allenai/MADLAD-400) (~40k documents).
Trained for 10 epochs using the hyperparameters from the [MasakhaNER paper](https://aclanthology.org/2021.tacl-1.66.pdf) (learning rate 5e-5, etc.).
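
Since the model was trained with a masked language modelling objective, it can presumably be queried through the `transformers` fill-mask pipeline. The sketch below assumes that; `MODEL_ID` is a hypothetical placeholder (this card does not state the repository id), and `mask_word` and `top_predictions` are illustrative helper names rather than part of any library.

```python
MODEL_ID = "your-username/your-tetun-model"  # hypothetical: replace with this model's repo id


def mask_word(sentence, word, mask_token="<mask>"):
    """Replace the first occurrence of `word` with the mask token."""
    return sentence.replace(word, mask_token, 1)


def top_predictions(sentence, word, model_id=MODEL_ID, top_k=3):
    """Mask `word` in `sentence` and return the model's top guesses.

    Downloads the model on first use, so it needs network access.
    """
    from transformers import pipeline  # imported lazily: heavy dependency

    fill = pipeline("fill-mask", model=model_id)
    text = mask_word(sentence, word, fill.tokenizer.mask_token)
    # Each prediction is a dict with "token_str" and "score" keys.
    return [(p["token_str"], p["score"]) for p in fill(text, top_k=top_k)]
```

With the real repository id filled in, calling `top_predictions("Ha'u hela iha Dili.", "Dili")` should return the three most likely fillers for the masked position, each with its score.
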