ViCrit: A Verifiable Reinforcement Learning Proxy Task for Visual Perception in VLMs Paper • 2506.10128 • Published Jun 11 • 23 • 2
MORSE-500: A Programmatically Controllable Video Benchmark to Stress-Test Multimodal Reasoning Paper • 2506.05523 • Published Jun 5 • 34 • 2
SoTA with Less: MCTS-Guided Sample Selection for Data-Efficient Visual Reasoning Self-Improvement Paper • 2504.07934 • Published Apr 10 • 20 • 2
Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension Paper • 2412.03704 • Published Dec 4, 2024 • 7 • 2