0-hero's Collections
R1-GRPO-Math-Python-Code-Experiments
Updated May 11
LoRA and full fine-tune experiments on R1 distills to generate Python code for math problems
0-hero/r1-7B-grpo-v3.3-epoch-3 • 8B • Updated Mar 28 • 1
0-hero/r1-7B-grpo-v3.3-epoch-2 • 8B • Updated Mar 28 • 2
0-hero/r1-7B-grpo-v3.3-epoch-1 • 8B • Updated Mar 28 • 1
0-hero/r1-7B-grpo-v3.2-epoch-1 • 8B • Updated Mar 27 • 1
0-hero/r1-7B-grpo-v3.2-epoch-2 • 8B • Updated Mar 27 • 1
0-hero/r1-14B-grpo-v3.1-epoch-2 • 15B • Updated Mar 26 • 1
0-hero/r1-14B-grpo-v3.1-epoch-1 • 15B • Updated Mar 26 • 1
0-hero/r1-7B-grpo-v3.1-epoch-3 • 8B • Updated Mar 24 • 1
0-hero/r1-7B-grpo-v3.1-epoch-2 • 8B • Updated Mar 24 • 1
0-hero/r1-7B-grpo-v2-temp-1.0-60 • 8B • Updated Mar 23 • 2
0-hero/r1-14B-math-grpo-165 • 15B • Updated Mar 12 • 1
0-hero/r1-14B-math-grpo-80 • 15B • Updated Mar 11 • 1
0-hero/r1-7B-grpo-850 • 8B • Updated Mar 10 • 2
0-hero/r1-7B-grpo-710 • 8B • Updated Mar 10 • 1
0-hero/r1-7B-grpo-610 • 8B • Updated Mar 10 • 1
0-hero/r1-7B-grpo-80 • 8B • Updated Mar 10 • 1
0-hero/R1-7B-MATH-GRPO-FULL • 8B • Updated Mar 9 • 1
0-hero/R1-14B-GRPO • 15B • Updated Mar 8 • 3
0-hero/r1-7b-grpo-full • 8B • Updated Mar 6 • 2
0-hero/r1-8b-grpo-full • Updated Mar 6