---
license: mit
language:
- en
- de
- fr
- fi
- sv
- nl
---
# hmByT5 - Preliminary Language Models

Preliminary Historic Multilingual and Monolingual ByT5 Models. The following languages are currently covered:

* English (British Library Corpus - Books)
* German (Europeana Newspaper)
* French (Europeana Newspaper)
* Finnish (Europeana Newspaper)
* Swedish (Europeana Newspaper)
* Dutch (Delpher Corpus)

More details can be found in [our GitHub repository](https://github.com/stefan-it/hmByT5).
# Pretraining

We use the official JAX/FLAX example in Hugging Face Transformers to pretrain a ByT5 model on a single v3-8 TPU.
Details about the training can be found [here](https://github.com/stefan-it/hmByT5/tree/main/hmbyt5-flax).

This model was trained with `mean_noise_span_length=20` for one epoch.
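After pretraining, the checkpoint can be loaded with the standard Transformers API for a quick sanity check. The sketch below is illustrative only: `model_id` is a placeholder for this model repository's id, and it assumes the checkpoint is available in PyTorch format.

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

# Placeholder - replace with the id of this model repository.
model_id = "stefan-it/hmByT5"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = T5ForConditionalGeneration.from_pretrained(model_id)

# ByT5 operates directly on UTF-8 bytes, so historic spellings and OCR noise
# need no special vocabulary handling.
text = "Im Jahre 1849 wurde die Zeitung gegründet."
inputs = tokenizer(text, return_tensors="pt")

# Sequence length is roughly the number of UTF-8 bytes plus the </s> token.
print(inputs["input_ids"].shape)
```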
# Evaluation on Downstream Tasks (NER)

See detailed results at [hmLeaderboard](https://huggingface.co/spaces/stefan-it/hmLeaderboard).
# Acknowledgements

Research supported with Cloud TPUs from Google's [TPU Research Cloud](https://sites.research.google/trc/about/) (TRC).
Many thanks for providing access to the TPUs ❤️