Collection of models and papers for the work "Reinforcement Learning for Reasoning in Large Language Models with One Training Example"
- Reinforcement Learning for Reasoning in Large Language Models with One Training Example
  Paper • 2504.20571 • Published • 95
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1
  Text Generation • 2B • Updated • 1.79k
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi13
  Text Generation • 2B • Updated • 1.87k
- ypwang61/One-Shot-RLVR-Qwen2.5-Math-1.5B-pi1_pi13
  Text Generation • 2B • Updated • 113