Yuzhen Huang

yuzhen17

https://hyz17.github.io

HYZ17

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Part I: Tricks or Traps? A Deep Dive into RL for LLM Reasoning

Paper • 2508.08221 • Published 4 days ago • 30

upvoted 3 papers 10 days ago

Qwen-Image Technical Report

Paper • 2508.02324 • Published 12 days ago • 186

Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving

Paper • 2507.23726 • Published 15 days ago • 106

CompassVerifier: A Unified and Robust Verifier for LLMs Evaluation and Outcome Reward

Paper • 2508.03686 • Published 10 days ago • 32

upvoted a paper 14 days ago

Can One Domain Help Others? A Data-Centric Study on Multi-Domain Reasoning via Reinforcement Learning

Paper • 2507.17512 • Published 24 days ago • 36

upvoted an article 17 days ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Article • and 4 others • 18 days ago • 153

upvoted 2 papers 17 days ago

Group Sequence Policy Optimization

Paper • 2507.18071 • Published 23 days ago • 289

Agentic Reinforced Policy Optimization

Paper • 2507.19849 • Published 21 days ago • 139

upvoted a paper 25 days ago

Stabilizing Knowledge, Promoting Reasoning: Dual-Token Constraints for RLVR

Paper • 2507.15778 • Published 25 days ago • 19

upvoted a paper 28 days ago

SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories?

Paper • 2507.12415 • Published about 1 month ago • 41

upvoted 3 papers about 1 month ago

NeuralOS: Towards Simulating Operating Systems via Neural Generative Models

Paper • 2507.08800 • Published Jul 11 • 79

Scaling RL to Long Videos

Paper • 2507.07966 • Published Jul 10 • 155

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30 • 86

upvoted 3 papers about 2 months ago

Where to find Grokking in LLM Pretraining? Monitor Memorization-to-Generalization without Test

Paper • 2506.21551 • Published Jun 26 • 28

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25 • 46

Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs

Paper • 2506.19290 • Published Jun 24 • 50

upvoted 2 papers 2 months ago

Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 255

Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models

Paper • 2506.05176 • Published Jun 5 • 68

upvoted 2 papers 3 months ago

The Entropy Mechanism of Reinforcement Learning for Reasoning Language Models

Paper • 2505.22617 • Published May 28 • 128

SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond

Paper • 2505.19641 • Published May 26 • 67
