yifanzhang114's Collections
R1-Reward
MM-RLHF
SliME
MME-RealWorld

R1-Reward

updated 4 days ago

Training Multimodal Reward Model Through Stable Reinforcement Learning

  • yifanzhang114/R1-Reward-RL

    Viewer • Updated about 21 hours ago • 17.3k • 72

  • yifanzhang114/R1-Reward

    Updated about 21 hours ago • 53 • 3

  • R1-Reward: Training Multimodal Reward Model Through Stable Reinforcement Learning

    Paper • 2505.02835 • Published 4 days ago • 22