guox18

AI & ML interests

Alignment

Recent Activity

upvoted a paper about 1 month ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

authored a paper about 2 months ago

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR

authored a paper about 2 months ago

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards

Organizations

None yet

guox18's collections: 1