DeepSearch: Overcome the Bottleneck of Reinforcement Learning with Verifiable Rewards via Monte Carlo Tree Search Paper • 2509.25454 • Published 8 days ago • 119
AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play Paper • 2509.24193 • Published 9 days ago • 6