Penghong Zhao

DDDDrop

drop-hell

AI & ML interests

RL, Multimodal, Machine Learning

Recent Activity

authored a paper about 12 hours ago

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

upvoted a paper 9 days ago

PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

Organizations

None yet

Papers 1

arxiv:2508.21104

Models 0

None public yet

Datasets 0

None public yet