MaLA Corpus for Massive Language Adaptation of Large Language Models https://mala-lm.github.io

MaLA-LM
community
AI & ML interests: NLP & LLM
Organization Card
Welcome to MaLA-LM (Massive Language Adaptation of Large Language Models)! 🌍
MaLA-LM focuses on adapting large language models to support hundreds of languages, including many underrepresented ones. Our models are multilingual, scalable, and optimized for diverse linguistic tasks.
Featured 🗣️
Check out our multilingual LLM collections, featuring models trained to handle 500+ languages, ideal for global, multilingual applications.
Dive into the collections: EMMA-500 | MaLA corpus | MaLA-500
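For readers who want to try one of the listed checkpoints, here is a minimal sketch of loading an EMMA-500 model with the Hugging Face `transformers` library. It assumes the checkpoint loads through the standard `AutoModelForCausalLM` interface (the page tags it as Text Generation) and that `transformers` and PyTorch are installed. The `generate` helper and its defaults are hypothetical and not part of MaLA-LM's own tooling.

```python
# Repo id as listed on this page; any other listed Text Generation model could be swapped in.
MODEL_ID = "MaLA-LM/emma-500-llama3-8b-bi"


def generate(prompt: str, model_id: str = MODEL_ID, max_new_tokens: int = 64) -> str:
    """Return the prompt plus a generated continuation (hypothetical helper)."""
    # Imported lazily so this module can be imported without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the repo ships a tokenizer and causal-LM weights in the standard layout.
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Tokenize the prompt into PyTorch tensors and generate up to max_new_tokens tokens.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode the full sequence (prompt followed by continuation) back to text.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hyvää huomenta! Tänään")` downloads the weights on first use, which for an 8B-parameter model needs several gigabytes of disk space and memory. If these checkpoints are base models rather than chat-tuned ones, completion-style prompts are likely to work better than instructions.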
Join our Discord server 👋
https://discord.com/invite/F5mEb7U6we
Happy building! 🚀
Collections (5)
Enhancing massively multilingual adaptation of LLMs on 500+ languages https://mala-lm.github.io
- Massively Multilingual Adaptation of Large Language Models Using Bilingual Translation Data • Paper • 2506.00469 • Published • 2
- MaLA-LM/emma-500-llama3-8b-mono • Text Generation • Updated • 27
- MaLA-LM/emma-500-llama3-8b-bi • Text Generation • Updated • 48
- MaLA-LM/emma-500-llama3.1-8b-mono • Text Generation • Updated • 37
Models (59)
- MaLA-LM/emma-500-llama3.1-8b-bi • Text Generation • Updated • 70
- MaLA-LM/emma-500-llama3-8b-bi • Text Generation • Updated • 48
- MaLA-LM/emma-500-llama3-8b-mono • Text Generation • Updated • 27
- MaLA-LM/emma-500-llama3.1-8b-mono • Text Generation • Updated • 37
- MaLA-LM/lucky52-bloom-7b1-no-3 • Text Generation • Updated • 21
- MaLA-LM/lucky52-bloom-7b1-no-2 • Text Generation • Updated • 67
- MaLA-LM/lucky52-bloom-7b1-no-4 • Text Generation • Updated • 56
- MaLA-LM/lucky52-bloom-7b1-no-5 • Text Generation • Updated • 31
- MaLA-LM/lucky52-bloom-7b1-no-6 • Text Generation • Updated • 26
- MaLA-LM/lucky52-bloom-7b1-no-8 • Text Generation • Updated • 27
Datasets (13)
- MaLA-LM/mala-opus-dedup-2410 • Viewer • Updated • 44.3B • 12.3k • 1
- MaLA-LM/mala-code-reasoning-v2 • Viewer • Updated • 89.7M • 191 • 2
- MaLA-LM/mala-code-reasoning • Viewer • Updated • 44.9M • 118 • 1
- MaLA-LM/mala-monolingual-split • Viewer • Updated • 538M • 7.11k • 2
- MaLA-LM/mala-monolingual-filter • Viewer • Updated • 1.42B • 9.54k • 2
- MaLA-LM/mala-monolingual-integration • Viewer • Updated • 1.14B • 1.63k • 2
- MaLA-LM/mala-monolingual-dedup • Viewer • Updated • 969M • 8.12k • 2
- MaLA-LM/mala-bilingual-translation-corpus • Viewer • Updated • 14.4B • 1.95k • 5
- MaLA-LM/mala-opus-dedup-2410-sample • Viewer • Updated • 6.48B • 526
- MaLA-LM/mala-opus-dedup-shuffle-2410 • Preview • Updated • 3.73k