Renjie

RogerLos

AI & ML interests

LLM

Recent Activity

upvoted a paper 8 days ago

GoRL: An Algorithm-Agnostic Framework for Online Reinforcement Learning with Generative Policies

updated a model 15 days ago

RogerLos/all_pairs_rft_Qwen25-7B

published a model 15 days ago

RogerLos/all_pairs_rft_Qwen25-7B

Organizations

None yet

Collections 2

Papers 2

arxiv:2506.07712

arxiv:2402.14008

Models 495

RogerLos/all_pairs_rft_Qwen25-7B

8B • Updated 15 days ago • 14

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_90

8B • Updated 19 days ago • 11

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_85

8B • Updated 19 days ago • 14

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_80

8B • Updated 19 days ago • 12

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_75

8B • Updated 19 days ago • 12

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_70

8B • Updated 19 days ago • 11

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_65

8B • Updated 19 days ago • 13

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_60

8B • Updated 19 days ago • 17

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_55

8B • Updated 19 days ago • 16

RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_50

8B • Updated 19 days ago • 15

Datasets 12

RogerLos/open_r1_math_all_sampled_128k

Viewer • Updated 29 days ago • 128k • 29

RogerLos/open_r1_math_all_sampled_64k

Viewer • Updated 29 days ago • 64k • 32

RogerLos/open_r1_math_all_sampled_32k

Viewer • Updated 29 days ago • 32k • 29

RogerLos/open_r1_math_all_sampled_16k

Viewer • Updated 29 days ago • 16k • 28

RogerLos/open_r1_math_all_sampled_8k

Viewer • Updated 29 days ago • 8k • 29

RogerLos/open_r1_math_curriculum_220k

Viewer • Updated 29 days ago • 220k • 28

RogerLos/FCP_big_math_pro_SFT

Viewer • Updated Sep 26 • 384k • 27

RogerLos/FCP_general_reasoner_pro_SFT

Viewer • Updated Sep 26 • 272k • 22

RogerLos/FCP_general_reasoner_pro_C-plus_no_concise

Viewer • Updated Sep 25 • 133k • 22

RogerLos/FCP_big_math_pro_C-plus_no_concise

Viewer • Updated Sep 25 • 185k • 27
