Lj V. Miranda PRO

ljvmiranda921

https://ljvmiranda921.github.io

AI & ML interests

NLP - multilinguality, data-centric AI

Recent Activity

liked a Space 19 days ago

UD-Filipino/filbench-leaderboard

upvoted an article 27 days ago

🇵🇭 FilBench - Can LLMs Understand and Generate Filipino?

published an article 28 days ago

🇵🇭 FilBench - Can LLMs Understand and Generate Filipino?

upvoted an article 27 days ago

🇵🇭 FilBench - Can LLMs Understand and Generate Filipino?

Article • 28 days ago • 15

upvoted a collection 3 months ago

Reward Bench 2

Datasets, spaces, and models for Reward Bench 2 benchmark and paper! • 11 items • Updated Jun 3 • 14

upvoted 2 papers 4 months ago

R3: Robust Rubric-Agnostic Reward Models

Paper • 2505.13388 • Published May 19 • 11

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Paper • 2504.20571 • Published Apr 29 • 97

upvoted a paper 5 months ago

The Bitter Lesson Learned from 2,000+ Multilingual Benchmarks

Paper • 2504.15521 • Published Apr 22 • 64

upvoted a collection 6 months ago

SEA-VL: Multicultural VL Dataset for Southeast Asia

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia • 3 items • Updated Apr 12 • 19

upvoted a paper 6 months ago

Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia

Paper • 2503.07920 • Published Mar 10 • 100

upvoted 2 papers 8 months ago

Bridging the Data Provenance Gap Across Text, Speech and Video

Paper • 2412.17847 • Published Dec 19, 2024 • 10

2 OLMo 2 Furious

Paper • 2501.00656 • Published Dec 31, 2024 • 21

upvoted a paper 9 months ago

Qwen2.5 Technical Report

Paper • 2412.15115 • Published Dec 19, 2024 • 374

upvoted 2 collections 9 months ago

Multilingual LLM Evaluation

Multilingual Evaluation Benchmarks • 8 items • Updated Jul 31 • 27

SEACrowd: A Multilingual Multimodal Data Hub and Benchmark S

SEACrowd is a community movement project aimed at centralizing and standardizing AI resources for Southeast Asian languages, cultures, and/or regions. • 3 items • Updated Jun 18, 2024 • 8

upvoted a collection 10 months ago

OLMo 2

Artifacts for the OLMo 2 release. • 35 items • Updated May 1 • 138

upvoted a paper 10 months ago

TÜLU 3: Pushing Frontiers in Open Language Model Post-Training

Paper • 2411.15124 • Published Nov 22, 2024 • 66

upvoted a collection 10 months ago

Tulu 3 Datasets

All datasets released with Tulu 3 -- state of the art open post-training recipes. • 33 items • Updated Apr 30 • 88

upvoted a paper 10 months ago

Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback

Paper • 2410.19133 • Published Oct 24, 2024 • 11

upvoted a collection 11 months ago

Multilingual RewardBench (M-RewardBench) [ACL 2025 Main]

Multilingual Reward Model Evaluation Dataset and Results • 3 items • Updated May 15 • 4

upvoted a paper 11 months ago

M-RewardBench: Evaluating Reward Models in Multilingual Settings

Paper • 2410.15522 • Published Oct 20, 2024 • 12

upvoted 2 papers about 1 year ago

SeaLLMs 3: Open Foundation and Chat Multilingual Large Language Models for Southeast Asian Languages

Paper • 2407.19672 • Published Jul 29, 2024 • 59

Consent in Crisis: The Rapid Decline of the AI Data Commons

Paper • 2407.14933 • Published Jul 20, 2024 • 13