VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos Paper • 2505.23693 • Published 16 days ago • 56
Breaking Down Video LLM Benchmarks: Knowledge, Spatial Perception, or True Temporal Understanding? Paper • 2505.14321 • Published 26 days ago • 10
Fast-dLLM: Training-free Acceleration of Diffusion LLM by Enabling KV Cache and Parallel Decoding Paper • 2505.22618 • Published 17 days ago • 42
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning? Paper • 2505.23359 • Published 17 days ago • 39
The Climb Carves Wisdom Deeper Than the Summit: On the Noisy Rewards in Learning to Reason Paper • 2505.22653 • Published 17 days ago • 66
Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence Paper • 2505.23747 • Published 16 days ago • 67
Table-R1: Inference-Time Scaling for Table Reasoning Paper • 2505.23621 • Published 17 days ago • 91