SEACrowd: A Multilingual Multimodal Data Hub and Benchmark Suite for Southeast Asian Languages Paper • 2406.10118 • Published Jun 14, 2024 • 33
Towards Understanding the Fragility of Multilingual LLMs against Fine-Tuning Attacks Paper • 2410.18210 • Published Oct 23, 2024
Sailor2: Sailing in South-East Asia with Inclusive Multilingual LLMs Paper • 2502.12982 • Published Feb 18 • 18
Preference Tuning For Toxicity Mitigation Generalizes Across Languages Paper • 2406.16235 • Published Jun 23, 2024 • 11
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts Paper • 2202.01279 • Published Feb 2, 2022
Crosslingual Generalization through Multitask Finetuning Paper • 2211.01786 • Published Nov 3, 2022 • 2
LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons Paper • 2402.14086 • Published Feb 21, 2024 • 11
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model Paper • 2402.07827 • Published Feb 12, 2024 • 49
No Language Left Behind: Scaling Human-Centered Machine Translation Paper • 2207.04672 • Published Jul 11, 2022 • 2
Neural Generation Meets Real People: Building a Social, Informative Open-Domain Dialogue Agent Paper • 2207.12021 • Published Jul 25, 2022
SeamlessM4T-Massively Multilingual & Multimodal Machine Translation Paper • 2308.11596 • Published Aug 22, 2023 • 1
Seamless: Multilingual Expressive and Streaming Speech Translation Paper • 2312.05187 • Published Dec 8, 2023 • 14
Toward Joint Language Modeling for Speech Units and Text Paper • 2310.08715 • Published Oct 12, 2023 • 10
BLOOM+1: Adding Language Support to BLOOM for Zero-Shot Prompting Paper • 2212.09535 • Published Dec 19, 2022 • 1