DAPO: An Open-Source LLM Reinforcement Learning System at Scale • Paper • 2503.14476 • Published 29 days ago • 117
VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks • Paper • 2504.05118 • Published 9 days ago • 24
SQL-R1: Training Natural Language to SQL Reasoning Model By Reinforcement Learning • Paper • 2504.08600 • Published 5 days ago • 21