When Does Reasoning Matter? A Controlled Study of Reasoning's Contribution to Model Performance
Abstract
Reasoning models enhance performance across a wide range of tasks, surpassing instruction fine-tuned models on reasoning-intensive and open-ended tasks despite higher computational costs.
Large Language Models (LLMs) with reasoning capabilities have achieved state-of-the-art performance on a wide range of tasks. Despite this empirical success, the tasks and model scales at which reasoning becomes effective, as well as its training and inference costs, remain underexplored. In this work, we rely on a synthetic data distillation framework to conduct a large-scale supervised study. We compare instruction fine-tuned (IFT) and reasoning models of varying sizes on a wide range of math-centric and general-purpose tasks, evaluating both multiple-choice and open-ended formats. Our analysis reveals that reasoning consistently improves model performance, often matching or surpassing significantly larger IFT systems. Notably, while IFT remains Pareto-optimal in training and inference costs, reasoning models become increasingly valuable as model size scales, overcoming IFT performance limits on reasoning-intensive and open-ended tasks.
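To make the setup concrete, below is a minimal sketch of how such a synthetic data distillation pipeline is commonly structured: a teacher model generates reasoning traces for each prompt, traces whose final answers match a reference are kept, and the student's supervised targets are built either from answers alone (IFT) or from the full trace plus answer (reasoning). This is an illustrative assumption rather than the paper's released code; `teacher.generate`, the `Example` container, and the `\boxed{...}` answer extractor are hypothetical names.

```python
import re
from dataclasses import dataclass

@dataclass
class Example:
    prompt: str
    trace: str   # chain-of-thought produced by the teacher
    answer: str  # final answer extracted from the trace

def extract_final_answer(trace: str) -> str:
    """Toy extractor: take the last \\boxed{...} span if present,
    otherwise fall back to the last non-empty line."""
    matches = re.findall(r"\\boxed\{([^}]*)\}", trace)
    if matches:
        return matches[-1].strip()
    lines = [ln for ln in trace.strip().splitlines() if ln.strip()]
    return lines[-1].strip() if lines else ""

def distill(teacher, prompts, reference_answers):
    """Generate teacher traces and keep only verifiably correct ones.
    `teacher.generate` is a hypothetical API, not a real library call."""
    kept = []
    for prompt, gold in zip(prompts, reference_answers):
        trace = teacher.generate(prompt)
        answer = extract_final_answer(trace)
        if answer == gold:  # simple exact-match correctness filter
            kept.append(Example(prompt, trace, answer))
    return kept

def to_sft_targets(examples, mode="reasoning"):
    """Build supervised fine-tuning targets for the student.

    mode == "ift":       target is the final answer only.
    mode == "reasoning": target is the reasoning trace plus the answer.
    """
    if mode == "ift":
        return [(ex.prompt, ex.answer) for ex in examples]
    return [(ex.prompt, f"{ex.trace}\n{ex.answer}") for ex in examples]
```

Keeping the prompts fixed and varying only the supervision target is one way such a controlled comparison can isolate the contribution of reasoning traces from differences in training data.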
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- MobileLLM-R1: Exploring the Limits of Sub-Billion Language Model Reasoners with Open Training Recipes (2025)
- Beyond Scaling Law: A Data-Efficient Distillation Framework for Reasoning (2025)
- Long Chain-of-Thought Reasoning Across Languages (2025)
- Apriel-Nemotron-15B-Thinker (2025)
- Demystifying Scientific Problem-Solving in LLMs by Probing Knowledge and Reasoning (2025)
- Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking (2025)
- PromptCoT 2.0: Scaling Prompt Synthesis for Large Language Model Reasoning (2025)
Models citing this paper: 20
Datasets citing this paper: 2
Spaces citing this paper: 0