Testerpce's Collections
Multilingual
updated
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages
Paper • 2309.09400 • Published • 85
Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages
Paper • 2401.05811 • Published • 8
Is Preference Alignment Always the Best Option to Enhance LLM-Based Translation? An Empirical Analysis
Paper • 2409.20059 • Published • 17
Are Character-level Translations Worth the Wait? Comparing Character- and Subword-level Models for Machine Translation
Paper • 2302.14220 • Published
Cut Your Losses in Large-Vocabulary Language Models
Paper • 2411.09009 • Published • 50
How Do Multilingual Models Remember? Investigating Multilingual Factual Recall Mechanisms
Paper • 2410.14387 • Published • 1
Babel: Open Multilingual Large Language Models Serving Over 90% of Global Speakers
Paper • 2503.00865 • Published • 66
From Bytes to Ideas: Language Modeling with Autoregressive U-Nets
Paper • 2506.14761 • Published • 15
When Life Gives You Samples: The Benefits of Scaling up Inference Compute for Multilingual LLMs
Paper • 2506.20544 • Published • 10
SambaLingo: Teaching Large Language Models New Languages
Paper • 2404.05829 • Published • 13