The Alignment Waltz: Jointly Training Agents to Collaborate for Safety (arXiv:2510.08240)
Hybrid Reinforcement: When Reward Is Sparse, It's Better to Be Dense (arXiv:2510.07242)
RESTRAIN: From Spurious Votes to Signals — Self-Driven RL with Self-Penalization (arXiv:2510.02172)
Jointly Reinforcing Diversity and Quality in Language Model Generations (arXiv:2509.02534)