Haolin Liu

lhl616

AI & ML interests

None yet

Organizations

None yet

Recent Activity

upvoted a paper 5 days ago

Beyond Correctness: Harmonizing Process and Outcome Rewards through RL Training

Paper • 2509.03403 • Published 6 days ago • 20

upvoted a paper 13 days ago

Self-Rewarding Vision-Language Model via Reasoning Decomposition

Paper • 2508.19652 • Published 14 days ago • 82

upvoted 2 papers about 1 month ago

Towards Optimal Regret in Adversarial Linear MDPs with Bandit Feedback

Paper • 2310.11550 • Published Oct 17, 2023 • 1

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 125

upvoted a paper about 2 months ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31