A Survey of Context Engineering for Large Language Models Paper • 2507.13334 • Published 14 days ago • 221
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39
STITCH: Simultaneous Thinking and Talking with Chunked Reasoning for Spoken Language Models Paper • 2507.15375 • Published 11 days ago • 25
Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning Paper • 2507.17512 • Published 9 days ago • 33
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization Paper • 2507.14683 • Published 12 days ago • 117
ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published Jun 23 • 28
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39 • 8
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39 • 8
Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published about 1 month ago • 72
AceReason Collection Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 10 days ago • 14
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39 • 8
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39 • 8
AceReason-Nemotron 1.1: Advancing Math and Code Reasoning through SFT and RL Synergy Paper • 2506.13284 • Published Jun 16 • 24
Reinforcement Learning with Verifiable Rewards Implicitly Incentivizes Correct Reasoning in Base LLMs Paper • 2506.14245 • Published Jun 17 • 39 • 8