reasoning-project

AI & ML interests

None defined yet.

Recent Activity

Cartinoe5930 authored a paper 19 days ago

Pushing on Multilingual Reasoning Models with Language-Mixed Chain-of-Thought

JW17 authored a paper 4 months ago

AlphaPO -- Reward shape matters for LLM alignment

JW17 authored a paper 4 months ago

Online Difficulty Filtering for Reasoning Oriented Reinforcement Learning

models 4

reasoning-project/Q25M-1.5B-MR1-50k-SFT-v0.2-3epoch

Text Generation • 2B • Updated Feb 16 • 1

reasoning-project/Q25M-1.5B-Open-R1-55k-SFT-v0.1

Text Generation • 2B • Updated Feb 15 • 1

reasoning-project/Q25-1.5B-PRIME-55K-GRPO-Acc2-format5e1

reasoning-project/Q25-1.5B-Open-R1-55K-GRPO-Acc2-format5e1

datasets 0

None public yet