MixCPT Collection • Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources • 40 items
EMMA-500: Enhancing Massively Multilingual Adaptation of Large Language Models • Paper • 2409.17892 • Published Sep 26, 2024