VerIPO: Cultivating Long Reasoning in Video-LLMs via Verifier-Gudied Iterative Policy Optimization Paper • 2505.19000 • Published 17 days ago • 42
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting Paper • 2505.18822 • Published 18 days ago • 14
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting Paper • 2505.18822 • Published 18 days ago • 14
AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting Paper • 2505.18822 • Published 18 days ago • 14 • 2
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning Paper • 2505.17667 • Published 19 days ago • 87
AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published 25 days ago • 57
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models Paper • 2505.04921 • Published May 8 • 176