A collection of training corpora and models for "Multilingual Language Model Pretraining using Machine-translated Data".
BritLLM
models: 5
datasets: 18
britllm/TransWebEdu • 2.25k
britllm/TransWeb-Edu-English • Viewer • 36M • 1.69k
britllm/TransWeb-Edu-Spanish • Viewer • 35.2M • 1.87k • 3
britllm/TransWeb-Edu-French • Viewer • 36M • 1.91k
britllm/TransWeb-Edu-German • Viewer • 36M • 2.02k • 1
britllm/xnli_brit • Viewer • 9.69k • 140
britllm/piqa_scottish_gaelic • 7
britllm/piqa_welsh • 104
britllm/piqa_irish • 10
britllm/arc_scottish_gaelic • Viewer • 7.56k • 37