Renjie
RogerLos
AI & ML interests
LLM
Recent Activity
- updated a model 15 days ago: RogerLos/all_pairs_rft_Qwen25-7B
- published a model 15 days ago: RogerLos/all_pairs_rft_Qwen25-7B
Organizations
None yet
Long_CoT_Degradation_SFT
Checkpoint for Long CoT Degradation
- RogerLos/curriculum_220k_long-cot_Llama-3.2-1B-Instruct • Text Generation • 1B • Updated • 4
- RogerLos/curriculum_220k_long-cot_Llama-3.2-3B-Instruct • Text Generation • 3B • Updated • 5
- RogerLos/curriculum_220k_long-cot_Llama-3.1-8B-Instruct • Text Generation • 8B • Updated • 3
- RogerLos/curriculum_220k_long-cot_Qwen2.5-0.5B-Instruct • Text Generation • 0.5B • Updated • 6
Long_CoT_Degradation_RL
- RogerLos/verl-grpo-128k-Qwen2.5-0.5B-Instruct-global_step_10 • 0.6B • Updated • 3
- RogerLos/verl-grpo-128k-Qwen2.5-0.5B-Instruct-global_step_100 • 0.6B • Updated • 5
- RogerLos/verl-grpo-128k-Qwen2.5-0.5B-Instruct-global_step_110 • 0.6B • Updated • 5
- RogerLos/verl-grpo-128k-Qwen2.5-0.5B-Instruct-global_step_20 • 0.6B • Updated • 3
Models (495)
- RogerLos/all_pairs_rft_Qwen25-7B • 8B • Updated • 14
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_90 • 8B • Updated • 11
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_85 • 8B • Updated • 14
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_80 • 8B • Updated • 12
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_75 • 8B • Updated • 12
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_70 • 8B • Updated • 11
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_65 • 8B • Updated • 13
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_60 • 8B • Updated • 17
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_55 • 8B • Updated • 16
- RogerLos/GRPO-GPT5nano-critique-big_math_vanilla_partial_online_math-verify_rft-global_step_50 • 8B • Updated • 15
Datasets (12)
- RogerLos/open_r1_math_all_sampled_128k • Viewer • Updated • 128k • 29
- RogerLos/open_r1_math_all_sampled_64k • Viewer • Updated • 64k • 32
- RogerLos/open_r1_math_all_sampled_32k • Viewer • Updated • 32k • 29
- RogerLos/open_r1_math_all_sampled_16k • Viewer • Updated • 16k • 28
- RogerLos/open_r1_math_all_sampled_8k • Viewer • Updated • 8k • 29
- RogerLos/open_r1_math_curriculum_220k • Viewer • Updated • 220k • 28
- RogerLos/FCP_big_math_pro_SFT • Viewer • Updated • 384k • 27
- RogerLos/FCP_general_reasoner_pro_SFT • Viewer • Updated • 272k • 22
- RogerLos/FCP_general_reasoner_pro_C-plus_no_concise • Viewer • Updated • 133k • 22
- RogerLos/FCP_big_math_pro_C-plus_no_concise • Viewer • Updated • 185k • 27