Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers Paper • 2506.23918 • Published 24 days ago • 78
VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization Paper • 2505.19000 • Published May 25 • 43
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting Paper • 2505.18822 • Published May 24 • 14
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published May 23 • 89
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published May 17 • 58
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8 • 179
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs Paper • 2504.11536 • Published Apr 15 • 61
FuseChat-3.0: Preference Optimization Meets Heterogeneous Model Fusion Paper • 2503.04222 • Published Mar 6 • 15
VLM^2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues Paper • 2502.12084 • Published Feb 17 • 31