wongyukim
AI & ML interests
None yet
Recent Activity
upvoted a paper about 5 hours ago
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning
upvoted a paper about 5 hours ago
Cooper: Co-Optimizing Policy and Reward Models in Reinforcement Learning for Large Language Models
upvoted a paper about 5 hours ago
AWorld: Dynamic Multi-Agent System with Stable Maneuvering for Robust GAIA Problem Solving
Organizations
None yet