A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond
Abstract
Recent Large Reasoning Models (LRMs), such as DeepSeek-R1 and OpenAI o1, have demonstrated strong performance gains by scaling up the length of Chain-of-Thought (CoT) reasoning during inference. However, a growing concern lies in their tendency to produce excessively long reasoning traces, which are often filled with redundant content (e.g., repeated definitions), over-analysis of simple problems, and superficial exploration of multiple reasoning paths for harder tasks. This inefficiency introduces significant challenges for training, inference, and real-world deployment (e.g., in agent-based systems), where token economy is critical. In this survey, we provide a comprehensive overview of recent efforts aimed at improving reasoning efficiency in LRMs, with a particular focus on the unique challenges that arise in this new paradigm. We identify common patterns of inefficiency, examine methods proposed across the LRM lifecycle, i.e., from pretraining to inference, and discuss promising future directions for research. To support ongoing development, we also maintain a real-time GitHub repository tracking recent progress in the field. We hope this survey serves as a foundation for further exploration and inspires innovation in this rapidly evolving area.
Community
We provide a definition of reasoning efficiency, identify and characterize common patterns of reasoning inefficiency, and outline the challenges unique to improving reasoning efficiency in LRMs.
We provide a comprehensive review of recent advances aimed at enhancing reasoning efficiency, structured along the end-to-end LRM development pipeline: from pretraining and supervised fine-tuning to reinforcement learning and inference.
This is an automated message from the Librarian Bot. I found the following similar papers, recommended by the Semantic Scholar API:
- Stop Overthinking: A Survey on Efficient Reasoning for Large Language Models (2025)
- SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs (2025)
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching (2025)
- Boosting Multimodal Reasoning with MCTS-Automated Structured Thinking (2025)
- Towards Widening The Distillation Bottleneck for Reasoning Models (2025)
- Policy Guided Tree Search for Enhanced LLM Reasoning (2025)
- Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't (2025)
Interesting work!