TransWebLLM
Collection
A collection of training corpus and models for "Multilingual Language Model Pretraining using Machine-translated Data".
This is a raw, pretrained multilingual language model supporting Arabic, Welsh, German, English, Spanish, French, Indonesian, Italian, Russian, and Swahili. The model is pretrained from scratch and has not been instruction-tuned, so it should be fine-tuned further for most use cases.
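As a rough illustration of how a raw pretrained checkpoint like this is typically used, here is a minimal sketch with the Hugging Face `transformers` library. It assumes the model is a causal (decoder-only) language model that loads through `AutoModelForCausalLM`, which this card does not state. `MODEL_ID` is a hypothetical placeholder, since the repository id is not given here, and the helper names `is_supported` and `generate` are illustrative only.

```python
# ISO 639-1 codes for the ten languages listed above.
SUPPORTED_LANGUAGES = {
    "ar": "Arabic",
    "cy": "Welsh",
    "de": "German",
    "en": "English",
    "es": "Spanish",
    "fr": "French",
    "id": "Indonesian",
    "it": "Italian",
    "ru": "Russian",
    "sw": "Swahili",
}

# Hypothetical placeholder: replace with the actual repository id of this model.
MODEL_ID = "your-org/your-model"


def is_supported(lang_code: str) -> bool:
    """Return True if the ISO 639-1 code is one of the listed languages."""
    return lang_code.strip().lower() in SUPPORTED_LANGUAGES


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 50) -> str:
    """Continue `prompt` as plain text (raw pretrained model, no chat template).

    Assumes a causal language model; adjust the Auto class if that is wrong.
    """
    # Imported lazily so the language helper works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Because the model is not instruction-tuned, prompts work best as text to be continued (for example, the opening of a sentence in one of the supported languages) rather than as questions or commands.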
For more details: Multilingual Language Model Pretraining using Machine-translated Data
Contact
Email: [email protected]