GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning Paper • 2507.01006 • Published 2 days ago • 157
SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks Paper • 2507.01001 • Published 2 days ago • 35
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published 3 days ago • 44
SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning Paper • 2506.24119 • Published 3 days ago • 36
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published 10 days ago • 27
MultiFinBen: A Multilingual, Multimodal, and Difficulty-Aware Benchmark for Financial LLM Evaluation Paper • 2506.14028 • Published 17 days ago • 90
GenRecal: Generation after Recalibration from Large to Small Vision-Language Models Paper • 2506.15681 • Published 15 days ago • 36
VeriThinker: Learning to Verify Makes Reasoning Model Efficient Paper • 2505.17941 • Published May 23 • 24
Distilling LLM Agent into Small Models with Retrieval and Code Tools Paper • 2505.17612 • Published May 23 • 78
VisionReasoner: Unified Visual Perception and Reasoning via Reinforcement Learning Paper • 2505.12081 • Published May 17 • 17
The CoT Encyclopedia: Analyzing, Predicting, and Controlling how a Reasoning Model will Think Paper • 2505.10185 • Published May 15 • 25
Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models Paper • 2504.20157 • Published Apr 28 • 37