Article 🇵🇭 FilBench - Can LLMs Understand and Generate Filipino? By ljvmiranda921 and 8 others • 28 days ago • 15
Reward Bench 2 Collection Datasets, spaces, and models for Reward Bench 2 benchmark and paper! • 11 items • Updated Jun 3 • 14
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published Apr 29 • 97
The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks Paper • 2504.15521 • Published Apr 22 • 64
SEA-VL: Multicultural VL Dataset for Southeast Asia Collection Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia • 3 items • Updated Apr 12 • 19
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia Paper • 2503.07920 • Published Mar 10 • 100
Bridging the Data Provenance Gap Across Text, Speech and Video Paper • 2412.17847 • Published Dec 19, 2024 • 10
Multilingual LLM Evaluation Collection Multilingual Evaluation Benchmarks • 8 items • Updated Jul 31 • 27
SEACrowd: A Multilingual Multimodal Data Hub and Benchmark S Collection SEACrowd is a community movement project aimed at centralizing and standardizing AI resources for Southeast Asian languages, cultures, and/or regions. • 3 items • Updated Jun 18, 2024 • 8
TÜLU 3: Pushing Frontiers in Open Language Model Post-Training Paper • 2411.15124 • Published Nov 22, 2024 • 66
Tulu 3 Datasets Collection All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 88
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback Paper • 2410.19133 • Published Oct 24, 2024 • 11
Multilingual RewardBench (M-RewardBench) [ACL 2025 Main] Collection Multilingual Reward Model Evaluation Dataset and Results • 3 items • Updated May 15 • 4
M-RewardBench: Evaluating Reward Models in Multilingual Settings Paper • 2410.15522 • Published Oct 20, 2024 • 12
SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages Paper • 2407.19672 • Published Jul 29, 2024 • 59
Consent in Crisis: The Rapid Decline of the AI Data Commons Paper • 2407.14933 • Published Jul 20, 2024 • 13