SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond Paper • 2505.19641 • Published 3 days ago • 41
hkust-nlp/Qwen-2.5-7B-Verifier-general-verifier Reinforcement Learning • Updated about 12 hours ago
hkust-nlp/Qwen-2.5-7B-Verifier-R1-Verifier-1.5B Reinforcement Learning • Updated about 12 hours ago
Learn to Reason Efficiently with Adaptive Length-based Reward Shaping Paper • 2505.15612 • Published 7 days ago • 31
RL-Verifier-Pitfalls Collection The collection for the paper "Pitfalls of Rule- and Model-based Verifiers: A Case Study on Mathematical Reasoning." • 7 items • Updated 3 days ago