ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition Paper • 2503.21248 • Published 2 days ago • 15
LMMs-Eval: Reality Check on the Evaluation of Large Multimodal Models Paper • 2407.12772 • Published Jul 17, 2024 • 35
Video-MMMU: Evaluating Knowledge Acquisition from Multi-Discipline Professional Videos Paper • 2501.13826 • Published Jan 23 • 25