wu

jkwudeeplearning

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

START: Self-taught Reasoner with Tools

upvoted an article 25 days ago

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

liked a Space 2 months ago

HuggingFaceH4/blogpost-scaling-test-time-compute

Organizations

None yet

jkwudeeplearning's activity

upvoted a paper 3 days ago

START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published 3 days ago • 76

upvoted an article 25 days ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

about 1 month ago • 63

liked a Space 2 months ago

Scaling test-time compute

Enhance math problem solving by scaling test-time compute