citrinegui/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_vrex_0.5_0.5_SEC0.99DRO0.0G0.0_minp0.0_1200 • Text Generation • 2B • Updated 15 days ago • 14
shubhamprshr/Qwen2.5-1.5B-Instruct_blocksworld1246_grpo_balanced_0.5_0.5_SEC0.3DRO1.0G0.0_minpTrue_1200 • Text Generation • 2B • Updated 7 days ago • 7
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_294 • Reinforcement Learning • 2B • Updated 3 days ago • 486
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_588 • Reinforcement Learning • 2B • Updated 2 days ago • 172
AzalKhan/Qwen2.5-1.5B-Instruct_open-r1-DAPO-Math-17k-Processed_1 • Reinforcement Learning • 2B • Updated 4 days ago • 9