Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training Paper • 2508.00414 • Published 4 days ago • 51
Reasoning or Memorization? Unreliable Results of Reinforcement Learning Due to Data Contamination Paper • 2507.10532 • Published 22 days ago • 83
Towards Solving More Challenging IMO Problems via Decoupled Reasoning and Proving Paper • 2507.06804 • Published 28 days ago • 15
Group Think: Multiple Concurrent Reasoning Agents Collaborating at Token Level Granularity Paper • 2505.11107 • Published May 16 • 29
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation Paper • 2505.10962 • Published May 16 • 8
MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation Paper • 2505.10962 • Published May 16 • 8
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 134
DeepMath-103K: A Large-Scale, Challenging, Decontaminated, and Verifiable Mathematical Dataset for Advancing Reasoning Paper • 2504.11456 • Published Apr 15 • 13
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published Mar 31 • 24
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published Mar 31 • 24
Running 2.94k 2.94k The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters
Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs Paper • 2501.18585 • Published Jan 30 • 61
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Paper • 2410.06508 • Published Oct 9, 2024 • 11
Towards Self-Improvement of LLMs via MCTS: Leveraging Stepwise Knowledge with Curriculum Preference Learning Paper • 2410.06508 • Published Oct 9, 2024 • 11
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search Paper • 2410.03864 • Published Oct 4, 2024 • 12
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows Paper • 2409.17433 • Published Sep 25, 2024 • 9
HDFlow: Enhancing LLM Complex Problem-Solving with Hybrid Thinking and Dynamic Workflows Paper • 2409.17433 • Published Sep 25, 2024 • 9