Multimodal Spatial Reasoning in the Large Model Era: A Survey and Benchmarks Paper • 2510.25760 • Published 3 days ago • 16
VITA-E: Natural Embodied Interaction with Concurrent Seeing, Hearing, Speaking, and Acting Paper • 2510.21817 • Published 11 days ago • 41
SpatialLadder: Progressive Training for Spatial Reasoning in Vision-Language Models Paper • 2510.08531 • Published 23 days ago • 11
GSM8K-V: Can Vision Language Models Solve Grade School Math Word Problems in Visual Contexts Paper • 2509.25160 • Published Sep 29 • 30
EasySteer: A Unified Framework for High-Performance and Extensible LLM Steering Paper • 2509.25175 • Published Sep 29 • 29
ReSum: Unlocking Long-Horizon Search Intelligence via Context Summarization Paper • 2509.13313 • Published Sep 16 • 77
WebResearcher: Unleashing unbounded reasoning capability in Long-Horizon Agents Paper • 2509.13309 • Published Sep 16 • 67
Towards General Agentic Intelligence via Environment Scaling Paper • 2509.13311 • Published Sep 16 • 70
WebSailor-V2: Bridging the Chasm to Proprietary Agents via Synthetic Data and Scalable Reinforcement Learning Paper • 2509.13305 • Published Sep 16 • 89
WebWeaver: Structuring Web-Scale Evidence with Dynamic Outlines for Open-Ended Deep Research Paper • 2509.13312 • Published Sep 16 • 105
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 184
POINTS-Reader: Distillation-Free Adaptation of Vision-Language Models for Document Conversion Paper • 2509.01215 • Published Sep 1 • 50
Baichuan-M2: Scaling Medical Capability with Large Verifier System Paper • 2509.02208 • Published Sep 2 • 41
UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning Paper • 2509.02544 • Published Sep 2 • 123
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published Sep 2 • 219
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models Paper • 2508.05613 • Published Aug 7 • 17