🐙 OctoThinker Collection Mid-training Incentivizes Reinforcement Learning Scaling • 18 items • Updated about 19 hours ago • 1
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published 1 day ago • 25
🧙 Guru Collection Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective • 4 items • Updated 7 days ago
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective Paper • 2506.14965 • Published 9 days ago • 42
ScienceBoard: Evaluating Multimodal Autonomous Agents in Realistic Scientific Workflows Paper • 2505.19897 • Published May 26 • 102
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping Paper • 2505.15612 • Published May 21 • 33
Scaling Computer-Use Grounding via User Interface Decomposition and Synthesis Paper • 2505.13227 • Published May 19 • 45