reproducing DeepSeek R1 Zero with Qwen2.5-0.5B on two 4090 GPUs
rasdani
Recent Activity
updated a dataset about 6 hours ago: rasdani/swe-fixer-4k-token-limit-sorted-2k
published a dataset about 6 hours ago: rasdani/swe-fixer-4k-token-limit-sorted-2k
updated a dataset about 6 hours ago: rasdani/swe-fixer-4k-token-limit-sorted
Collections: 1
Papers: 1
Models (23)
rasdani/qwen3_0_6b_function_rm • 14
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-8192k • 1
rasdani/Qwen2.5-0.5B-simpleRL-Zoo • Text Generation • 7
rasdani/smolR1-Qwen2.5-0.5B • Text Generation • 8
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-no-KL
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-3072k
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-4096k
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-2560k
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-2048k
rasdani/Qwen2.5-0.5B-simpleRL-Zoo-first-try • 2
Datasets (101)
rasdani/swe-fixer-4k-token-limit-sorted-2k • 2k
rasdani/swe-fixer-4k-token-limit-sorted • 54.3k
rasdani/swe-fixer-4k-token-limit • 54.3k
rasdani/swe-fixer-70k-filtered • 54.3k • 58
rasdani/ifeval-genesys • 15k • 41
rasdani/ifeval-genesys-verified • 48 • 40
rasdani/ifeval-genesys-debug • 48 • 43
rasdani/swe-fixer-debug-DeepSeek-R1-verified • 30 • 141
rasdani/swe-fixer-debug-DeepSeek-R1 • 30 • 55
rasdani/swe-fixer-70k • 69.8k • 130