a-F1/math-qwen2.5-3b-reinforce-moa-3x1-unshared-actor_lr7.5e-7-epoch2-modenull-cftrue-decayepsfalse 3B • Updated Jul 10 • 2
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-beam_search-prm-completions • Updated May 10 • 4 • 6
a-F1/aime_2024-DeepSeek-R1-Distill-Qwen-1.5B-best_of_n-prm-completions • Updated May 10 • 4 • 9
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions Updated May 9 • 3
a-F1/DeepSeek-R1-Distill-Qwen-1.5B-Llama3.1-8B-PRM-Deepseek-Data-beam_search-prm-completions • Updated May 8 • 8 • 8
a-F1/DeepSeek-R1-Distill-Qwen-7B-Llama3.1-8B-PRM-Deepseek-Data-best_of_n-prm-completions • Updated May 7 • 7 • 12