Reasoning Rating Model
This repository contains the model described in the paper Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models.
Code: https://github.com/opendatalab/Meta-rater
Model Description
This model is a fine-tuned version of ModernBERT-base designed to evaluate the Reasoning dimension of text quality on a 0-5 scale. Reasoning measures the complexity of logical thinking and analysis required from readers, focusing on multi-step reasoning, argument structure, and analytical depth.
Model Details
- Base Model: ModernBERT-base
- Parameters: 149M
- Context Window: 4,096 tokens
- Task: Text quality rating (regression)
- Score Range: 0-5 (continuous)
- Performance: 89.59% F1 score, 96.32% accuracy
Rating Scale
The model uses an additive 5-point rating system:
- 0: No reasoning content
- 1: Preliminary reasoning elements with single causal relationships or simple logical judgments, lacking depth
- 2: Basic reasoning with some logical relationships requiring moderate thought, simple argumentative structures
- 3: Good reasoning complexity with multiple steps requiring complex thought, interrelated arguments with some depth
- 4: High reasoning complexity with multi-layered logic and in-depth analysis, comprehensive evaluation required
- 5: Exceptional reasoning complexity demanding deep analysis and innovative thinking, multi-dimensional and interdisciplinary
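The model outputs a continuous value, so downstream pipelines often need to bucket it into one of the discrete levels above. A minimal sketch, assuming a simple clamp-and-round rule (our convention, not part of the official pipeline):

```python
# Minimal sketch (our convention, not part of the official pipeline):
# clamp the model's continuous output to [0, 5] and round it to the
# nearest integer level of the scale above.
LEVELS = {
    0: "no reasoning content",
    1: "preliminary reasoning elements",
    2: "basic reasoning",
    3: "good reasoning complexity",
    4: "high reasoning complexity",
    5: "exceptional reasoning complexity",
}

def to_level(score: float) -> int:
    return round(min(max(score, 0.0), 5.0))

print(to_level(3.42), "->", LEVELS[to_level(3.42)])  # 3 -> good reasoning complexity
```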
Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "opendatalab/meta-rater-reasoning-rating"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

# Example text
text = "By analyzing the correlation between economic indicators and social outcomes, we can identify causal mechanisms that explain policy effectiveness..."

# Tokenize and predict
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
with torch.no_grad():
    outputs = model(**inputs)

# The model is a regressor with a single output, so read the logit
# directly instead of taking an argmax over class logits
score = outputs.logits.squeeze().item()
print(f"Reasoning Score: {score:.2f}")
```
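For corpus-scale use it is usually faster to score documents in batches. A hedged sketch building on the snippet above (device placement and the example texts are our additions):

```python
# Hedged sketch: score several documents in one forward pass. Reuses
# `tokenizer` and `model` from the snippet above.
import torch

texts = [
    "The sky is blue.",
    "If interest rates rise while inflation expectations fall, the real "
    "rate increases, which dampens investment through several channels...",
]

device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

inputs = tokenizer(texts, return_tensors="pt", padding=True,
                   truncation=True, max_length=4096).to(device)
with torch.no_grad():
    scores = model(**inputs).logits.squeeze(-1)  # shape: (len(texts),)

for text, score in zip(texts, scores.tolist()):
    print(f"{score:.2f}  {text[:60]}")
```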
Training Details
- Training Data: 747,422 examples from the SlimPajama dataset
- Annotation Model: Llama-3.3-70B-Instruct
- Training Epochs: 10
- Evaluation Split: 93,428 test examples
- Data Split: 8:1:1 (train:dev:test)
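For reference, a comparable regression fine-tune can be set up with the Hugging Face `Trainer`. Apart from the base model and the 10 training epochs reported above, everything below is an illustrative assumption, not the paper's exact recipe:

```python
# Hedged sketch of a comparable regression fine-tune; train_ds/dev_ds
# are placeholders for tokenized splits you prepare yourself.
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")
model = AutoModelForSequenceClassification.from_pretrained(
    "answerdotai/ModernBERT-base",
    num_labels=1,               # single continuous score
    problem_type="regression",  # use MSE loss on the 0-5 labels
)

args = TrainingArguments(
    output_dir="reasoning-rater",
    num_train_epochs=10,              # matches the reported setup
    per_device_train_batch_size=16,   # assumption
    learning_rate=3e-5,               # assumption
)
# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=dev_ds)
# trainer.train()
```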
Applications
This model is particularly valuable for:
- Educational content curation for reasoning skill development
- Research paper evaluation and academic content assessment
- Data selection for training reasoning-capable language models (see the filtering sketch after this list)
- Curriculum design for critical thinking courses
- Content filtering for analytical and argumentative writing
- Quality assessment of logical and analytical texts
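For the data-selection use case, a minimal sketch of threshold-based filtering, reusing `tokenizer` and `model` from the Usage section (the 3.0 cutoff is an assumption, not a recommended value):

```python
# Minimal sketch: keep only documents whose reasoning score clears a
# threshold. Reuses `tokenizer` and `model` from the Usage section.
import torch

def reasoning_score(text: str) -> float:
    inputs = tokenizer(text, return_tensors="pt",
                       truncation=True, max_length=4096)
    with torch.no_grad():
        return model(**inputs).logits.squeeze().item()

def select_for_pretraining(docs, threshold=3.0):
    for doc in docs:
        if reasoning_score(doc) >= threshold:
            yield doc

kept = list(select_for_pretraining([
    "Cats are popular pets around the world.",
    "We derive the bound by combining the union bound with a Chernoff "
    "estimate on each term, then optimizing over the free parameter...",
]))
```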
What the Model Evaluates
The model assesses various aspects of reasoning complexity:
- Logical structure and argument coherence
- Multi-step reasoning processes
- Causal relationship identification and analysis
- Evidence integration and synthesis
- Problem-solving approaches and methodologies
- Critical analysis depth and sophistication
Key Reasoning Indicators
- Simple reasoning: Single cause-effect relationships, basic problem-solution pairs
- Moderate reasoning: Multiple interrelated factors, comparative analysis
- Complex reasoning: Multi-layered arguments, comprehensive evaluations
- Advanced reasoning: Interdisciplinary integration, innovative thinking, mathematical models
What the Model Does NOT Consider
- The specific language the text is written in
- The length of the text
- Usage of placeholders for data privacy or safety
- Writing style, grammar, or formatting quality
Limitations
- Designed primarily for English text
- May not capture domain-specific reasoning patterns equally well across all fields
- Performance may vary for highly specialized mathematical or technical reasoning
- Should be combined with other quality metrics for comprehensive assessment
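As one way to combine metrics, a naive fixed-weight average is sketched below. The dimension names, values, and weights are purely hypothetical; the Meta-rater method itself combines dimensions in a more principled way than fixed hand-set weights:

```python
# Illustrative only: combine this model's score with other quality
# dimensions via a fixed weighted average. All names and numbers here
# are hypothetical placeholders.
def combined_quality(scores: dict[str, float],
                     weights: dict[str, float]) -> float:
    total = sum(weights.values())
    return sum(scores[k] * weights[k] for k in weights) / total

example = combined_quality(
    scores={"reasoning": 3.8, "other_dimension": 4.2},
    weights={"reasoning": 0.5, "other_dimension": 0.5},
)
print(f"Combined quality: {example:.2f}")
```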
Citation
If you use this model in your research, please cite:
```bibtex
@article{zhuang2025meta,
  title={Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models},
  author={Zhuang, Xinlin and Peng, Jiahui and Ma, Ren and Wang, Yinfan and Bai, Tianyi and Wei, Xingjian and Qiu, Jiantao and Zhang, Chi and Qian, Ying and He, Conghui},
  journal={arXiv preprint arXiv:2504.14194},
  year={2025}
}
```
License
This model is released under the same license as the base ModernBERT model.
Contact
For questions or issues, please contact the authors or open an issue in the repository.