Reasoning over Boundaries: Enhancing Specification Alignment via Test-time Delibration Paper • 2509.14760 • Published 19 days ago • 51
WebExplorer: Explore and Evolve for Training Long-Horizon Web Agents Paper • 2509.06501 • Published 29 days ago • 77
Speed Always Wins: A Survey on Efficient Architectures for Large Language Models Paper • 2508.09834 • Published Aug 13 • 53
Divide and Conquer: Grounding LLMs as Efficient Decision-Making Agents via Offline Hierarchical Reinforcement Learning Paper • 2505.19761 • Published May 26
Text-to-Decision Agent: Offline Meta-Reinforcement Learning from Natural Language Supervision Paper • 2504.15046 • Published Apr 21
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models Paper • 2507.15375 • Published Jul 21 • 26
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction Paper • 2507.02025 • Published Jul 2 • 35
Attention-Guided Contrastive Role Representations for Multi-Agent Reinforcement Learning Paper • 2312.04819 • Published Dec 8, 2023
The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models Paper • 2505.22617 • Published May 28 • 130
CPGD: Toward Stable Rule-based Reinforcement Learning for Language Models Paper • 2505.12504 • Published May 18 • 24
MM-PRM: Enhancing Multimodal Mathematical Reasoning with Scalable Step-Level Supervision Paper • 2505.13427 • Published May 19 • 26