Papers
arxiv:2510.17314

Auto-Rubric: Learning to Extract Generalizable Criteria for Reward Modeling

Published on Oct 20
Authors:
,
,
,
,
,
,
,
,
,
,
,

Abstract

A novel training-free framework infers and generalizes human preference rubrics for reward modeling, achieving high data efficiency and performance with minimal preference data.

AI-generated summary

Reward models are essential for aligning Large Language Models (LLMs) with human values, yet their development is hampered by costly preference datasets and poor interpretability. While recent rubric-based approaches offer transparency, they often lack systematic quality control and optimization, creating a trade-off between scalability and reliability. We address these limitations with a novel, training-free framework built on a key assumption: evaluation rubrics underlying human preferences exhibit significant generalization ability across diverse queries, a property that enables remarkable data efficiency. Our two-stage approach first infers high-quality, query-specific rubrics using a validation-guided Propose-Evaluate-Revise pipeline. Second, it generalizes these granular rubrics into a compact, non-redundant core set by maximizing an information-theoretic coding rate. The final output is an interpretable, hierarchical "Theme-Tips" rubric set. Extensive experiments demonstrate the framework's exceptional data efficiency and performance. Critically, using just 70 preference pairs (1.5\% of the source data), our method also empowers smaller models like Qwen3-8B to outperform specialized, fully-trained counterparts. This work pioneers a scalable, interpretable, and data-efficient path for reward modeling.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2510.17314 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2510.17314 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.