Reasoning Rating Model

This repository contains the model described in the paper Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models.

Code: https://github.com/opendatalab/Meta-rater

Model Description

This model is a fine-tuned version of ModernBERT-base designed to evaluate the Reasoning dimension of text quality on a continuous 0-5 scale. Reasoning measures the complexity of logical thinking and analysis required of readers, focusing on multi-step reasoning, argument structure, and analytical depth.

Model Details

  • Base Model: ModernBERT-base
  • Parameters: 149M
  • Context Window: 4,096 tokens
  • Task: Text quality rating (regression)
  • Score Range: 0-5 (continuous)
  • Performance: 89.59% F1 score, 96.32% accuracy

Rating Scale

The model uses an additive 5-point rating system (a sketch for mapping its continuous output onto these levels follows the list):

  • 0: No reasoning content
  • 1: Preliminary reasoning elements with single causal relationships or simple logical judgments, lacking depth
  • 2: Basic reasoning with some logical relationships requiring moderate thought, simple argumentative structures
  • 3: Good reasoning complexity with multiple steps requiring complex thought, interrelated arguments with some depth
  • 4: High reasoning complexity with multi-layered logic and in-depth analysis, comprehensive evaluation required
  • 5: Exceptional reasoning complexity demanding deep analysis and innovative thinking, multi-dimensional and interdisciplinary
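
Because the model's head is a regression over a continuous 0-5 range, downstream code often needs to snap a prediction back onto one of the discrete levels above. A minimal sketch (the helper name and the condensed level labels are illustrative, not part of the model):

# Map a continuous 0-5 score to the nearest discrete rating level.
LEVELS = {
    0: "no reasoning content",
    1: "preliminary reasoning elements",
    2: "basic reasoning",
    3: "good reasoning complexity",
    4: "high reasoning complexity",
    5: "exceptional reasoning complexity",
}

def nearest_level(score: float) -> tuple[int, str]:
    level = min(5, max(0, round(score)))
    return level, LEVELS[level]

print(nearest_level(3.42))  # (3, 'good reasoning complexity')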

Usage

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "opendatalab/meta-rater-reasoning-rating"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

# Example text
text = "By analyzing the correlation between economic indicators and social outcomes, we can identify causal mechanisms that explain policy effectiveness..."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
with torch.no_grad():
    outputs = model(**inputs)
    # Regression head: the single logit is the continuous 0-5 score
    score = outputs.logits.squeeze().item()

print(f"Reasoning Score: {score:.2f}")
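
For scoring many documents, batching the forward pass is substantially faster than looping one text at a time. A sketch of a batched scorer reusing the tokenizer and model loaded above (the batch size and padding settings are choices, not requirements of the model):

def score_batch(texts, batch_size=16):
    """Return one continuous 0-5 reasoning score per input text."""
    scores = []
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(
            texts[i:i + batch_size],
            return_tensors="pt",
            truncation=True,
            max_length=4096,
            padding=True,
        )
        with torch.no_grad():
            logits = model(**batch).logits  # shape: (batch, 1)
        scores.extend(logits.squeeze(-1).tolist())
    return scores

print(score_batch(["Water is wet.", text]))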

Training Details

  • Training Data: 747,422 training examples from the SlimPajama dataset
  • Annotation Model: Llama-3.3-70B-Instruct
  • Training Epochs: 10
  • Data Split: 8:1:1 (train:dev:test), yielding 93,428 test examples

Applications

This model is particularly valuable for:

  • Educational content curation for reasoning skill development
  • Research paper evaluation and academic content assessment
  • Data selection for training reasoning-capable language models (see the sketch after this list)
  • Curriculum design for critical thinking courses
  • Content filtering for analytical and argumentative writing
  • Quality assessment of logical and analytical texts
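
For the data-selection use case, the simplest recipe is to score every candidate document and keep those above a cutoff (or the top-scoring fraction). A minimal sketch building on score_batch from the Usage section; the corpus loader is a hypothetical placeholder, the 3.0 threshold is illustrative, and the Meta-rater method itself combines multiple quality dimensions rather than thresholding a single rater:

# Filter a candidate pre-training corpus by predicted reasoning score.
corpus = load_candidate_documents()  # hypothetical loader returning a list of strings
scores = score_batch(corpus)
selected = [doc for doc, s in zip(corpus, scores) if s >= 3.0]
print(f"kept {len(selected)}/{len(corpus)} documents")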

What the Model Evaluates

The model assesses various aspects of reasoning complexity:

  • Logical structure and argument coherence
  • Multi-step reasoning processes
  • Causal relationship identification and analysis
  • Evidence integration and synthesis
  • Problem-solving approaches and methodologies
  • Critical analysis depth and sophistication

Key Reasoning Indicators

  • Simple reasoning: Single cause-effect relationships, basic problem-solution pairs
  • Moderate reasoning: Multiple interrelated factors, comparative analysis
  • Complex reasoning: Multi-layered arguments, comprehensive evaluations
  • Advanced reasoning: Interdisciplinary integration, innovative thinking, mathematical models

What the Model Does NOT Consider

  • The specific language the text is written in
  • The length of the text
  • Usage of placeholders for data privacy or safety
  • Writing style, grammar, or formatting quality

Limitations

  • Designed primarily for English text
  • May not capture domain-specific reasoning patterns equally well across all fields
  • Performance may vary for highly specialized mathematical or technical reasoning
  • Should be combined with other quality metrics for comprehensive assessment (a sketch follows this list)
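
On the last point, a sketch of one way to combine this rater with raters for other quality dimensions via a weighted average. The companion repository name and the equal default weights are assumptions for illustration; the Meta-rater paper learns weights over quality scores rather than fixing them by hand:

# Hypothetical set of dimension raters; check the opendatalab hub page for actual names.
DIMENSIONS = {
    "reasoning": "opendatalab/meta-rater-reasoning-rating",
    "professionalism": "opendatalab/meta-rater-professionalism-rating",  # assumed name
}

def combined_score(text, weights=None):
    """Weighted average of per-dimension quality scores (equal weights by default)."""
    weights = weights or {name: 1.0 for name in DIMENSIONS}
    total = norm = 0.0
    for name, repo in DIMENSIONS.items():
        tok = AutoTokenizer.from_pretrained(repo)
        mdl = AutoModelForSequenceClassification.from_pretrained(repo)
        inputs = tok(text, return_tensors="pt", truncation=True, max_length=4096)
        with torch.no_grad():
            s = mdl(**inputs).logits.squeeze().item()
        total += weights[name] * s
        norm += weights[name]
    return total / norm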

Citation

If you use this model in your research, please cite:

@article{zhuang2025meta,
  title={Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models},
  author={Zhuang, Xinlin and Peng, Jiahui and Ma, Ren and Wang, Yinfan and Bai, Tianyi and Wei, Xingjian and Qiu, Jiantao and Zhang, Chi and Qian, Ying and He, Conghui},
  journal={arXiv preprint arXiv:2504.14194},
  year={2025}
}

License

This model is released under the same license as the base ModernBERT model.

Contact

For questions or issues, please contact the authors or open an issue in the repository.
