OpenRubrics: Towards Scalable Synthetic Rubric Generation for Reward Modeling and LLM Alignment
Abstract
Rubric-based reward models trained on OpenRubrics, built with Contrastive Rubric Generation, improve alignment in reinforcement learning from human feedback by providing scalable and reliable evaluation signals.
Reward modeling lies at the core of reinforcement learning from human feedback (RLHF), yet most existing reward models rely on scalar or pairwise judgments that fail to capture the multifaceted nature of human preferences. Recent studies have explored rubrics-as-rewards (RaR), which use structured natural-language criteria to capture multiple dimensions of response quality. However, producing rubrics that are both reliable and scalable remains a key challenge. In this work, we introduce OpenRubrics, a diverse, large-scale collection of (prompt, rubric) pairs for training rubric-generation and rubric-based reward models. To elicit discriminative and comprehensive evaluation signals, we introduce Contrastive Rubric Generation (CRG), which derives both hard rules (explicit constraints) and principles (implicit qualities) by contrasting preferred and rejected responses. We further improve reliability by enforcing preference-label consistency via rejection sampling to remove noisy rubrics. Across multiple reward-modeling benchmarks, our rubric-based reward model, Rubric-RM, surpasses strong size-matched baselines by 6.8%. These gains transfer to policy models on instruction-following and biomedical benchmarks. Our results show that rubrics provide scalable alignment signals that narrow the gap between costly human evaluation and automated reward modeling, enabling a new principle-driven paradigm for LLM alignment.
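To make the pipeline concrete, here is a minimal Python sketch of Contrastive Rubric Generation followed by preference-label consistency filtering. Everything below is an illustration under our own assumptions: the prompt templates, the bullet-list parsing, the 0-10 judge scoring, and the function names (`generate_contrastive_rubric`, `keep_if_label_consistent`) are hypothetical and not taken from the paper's released code; the LLM is abstracted as a plain string-to-string callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# An LLM call is modeled as a plain callable: prompt string in, completion string out.
LLM = Callable[[str], str]

@dataclass
class Rubric:
    """A structured rubric with explicit constraints and implicit qualities."""
    hard_rules: List[str] = field(default_factory=list)   # e.g. "Answer must cite the given context."
    principles: List[str] = field(default_factory=list)   # e.g. "Explanation should be concise."

# Hypothetical prompt template for Contrastive Rubric Generation: the generator sees both
# the preferred and rejected responses and is asked to articulate what separates them.
CRG_TEMPLATE = """You are writing an evaluation rubric for the prompt below.
Compare the PREFERRED and REJECTED responses and produce:
HARD RULES: explicit constraints the preferred response satisfies and the rejected one violates.
PRINCIPLES: implicit qualities that make the preferred response better.

Prompt: {prompt}
PREFERRED: {chosen}
REJECTED: {rejected}
"""

def generate_contrastive_rubric(prompt: str, chosen: str, rejected: str, generator: LLM) -> Rubric:
    """Elicit a rubric by contrasting the preferred and rejected responses (CRG)."""
    raw = generator(CRG_TEMPLATE.format(prompt=prompt, chosen=chosen, rejected=rejected))
    rubric, section = Rubric(), None
    for line in raw.splitlines():
        line = line.strip()
        if line.upper().startswith("HARD RULES"):
            section = rubric.hard_rules
        elif line.upper().startswith("PRINCIPLES"):
            section = rubric.principles
        elif line.startswith("-") and section is not None:
            section.append(line.lstrip("- ").strip())
    return rubric

def rubric_score(prompt: str, response: str, rubric: Rubric, judge: LLM) -> float:
    """Score a response against the rubric with an LLM judge; expects a number in [0, 10]."""
    criteria = "\n".join(f"- {c}" for c in rubric.hard_rules + rubric.principles)
    reply = judge(
        f"Rate the response from 0 to 10 against the rubric.\n"
        f"Rubric:\n{criteria}\nPrompt: {prompt}\nResponse: {response}\nScore:"
    )
    try:
        return float(reply.strip().split()[0])
    except (ValueError, IndexError):
        return 0.0

def keep_if_label_consistent(
    prompt: str, chosen: str, rejected: str, rubric: Rubric, judge: LLM
) -> Optional[Rubric]:
    """Rejection sampling: keep the rubric only if it reproduces the human preference label."""
    if rubric_score(prompt, chosen, rubric, judge) > rubric_score(prompt, rejected, rubric, judge):
        return rubric
    return None  # noisy rubric: it does not prefer the human-preferred response
```

In this framing, a generated rubric survives only if a judge that applies it reproduces the original human preference, which is the rejection-sampling step the abstract describes.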
Community
✨We introduce OpenRubrics, a scalable framework & dataset for structured rubric synthesis, and train Rubric-RM, a rubric-based reward model. Across multiple benchmarks, Rubric-RM outperforms strong size-matched baselines by +6.8%, and the gains transfer to policy models on instruction-following and biomedical tasks (+2.9% on average).
🗝️ Key ideas: separate hard rules and principles, use Contrastive Rubric Generation (CRG), and enforce preference–label consistency to reduce noise and improve interpretability at scale.
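For intuition on how such rubrics could drive a reward signal during policy training, the short sketch below aggregates per-criterion verdicts into a scalar reward. The boolean `judge` interface, the `rubric_reward` name, and the 2:1 weighting of hard rules over principles are illustrative assumptions, not the paper's Rubric-RM, which is a trained rubric-based reward model.

```python
from typing import Callable, List

# A judge call: (criterion, prompt, response) -> True if the criterion is satisfied.
Judge = Callable[[str, str, str], bool]

def rubric_reward(
    prompt: str,
    response: str,
    hard_rules: List[str],
    principles: List[str],
    judge: Judge,
    hard_weight: float = 2.0,   # assumed weighting: violating explicit constraints costs more
) -> float:
    """Aggregate per-criterion judgments into a single scalar reward in [0, 1]."""
    hard_hits = sum(judge(c, prompt, response) for c in hard_rules)
    soft_hits = sum(judge(c, prompt, response) for c in principles)
    total = hard_weight * len(hard_rules) + len(principles)
    if total == 0:
        return 0.0
    return (hard_weight * hard_hits + soft_hits) / total
```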
The following related papers were recommended by the Semantic Scholar API (via Librarian Bot):
- mR3: Multilingual Rubric-Agnostic Reward Reasoning Models (2025)
- Beyond Monolithic Rewards: A Hybrid and Multi-Aspect Reward Optimization for MLLM Alignment (2025)
- ACE-RL: Adaptive Constraint-Enhanced Reward for Long-form Generation Reinforcement Learning (2025)
- Reinforcement Learning with Rubric Anchors (2025)
- RLBFF: Binary Flexible Feedback to bridge between Human Feedback&Verifiable Rewards (2025)
- S2J: Bridging the Gap Between Solving and Judging Ability in Generative Reward Models (2025)
- A Survey of Process Reward Models: From Outcome Signals to Process Supervisions for Large Language Models (2025)