- ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models
  Paper • 2502.09696 • Published • 44
- MM-RLHF: The Next Step Forward in Multimodal LLM Alignment
  Paper • 2502.10391 • Published • 35
- Autellix: An Efficient Serving Engine for LLM Agents as General Programs
  Paper • 2502.13965 • Published • 19
- SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines
  Paper • 2502.14739 • Published • 104
Sangyeon Cho
josang1204
AI & ML interests
None yet
Recent Activity
upvoted an article 11 days ago
Mitigating False Negatives in Multiple Negatives Ranking Loss for Retriever Training
updated a collection 3 months ago
llm
liked a dataset 3 months ago
li-lab/MMLU-ProX