MPS-Prover: Advancing Stepwise Theorem Proving by Multi-Perspective Search and Data Curation
Abstract
Automated Theorem Proving (ATP) in formal languages remains a formidable challenge in AI, demanding rigorous logical deduction and the navigation of vast search spaces. While large language models (LLMs) have shown promising performance, existing stepwise provers often suffer from biased search guidance, leading to inefficiencies and suboptimal proof strategies. This paper introduces the Multi-Perspective Search Prover (MPS-Prover), a novel stepwise ATP system designed to overcome these limitations. MPS-Prover incorporates two key innovations: a highly effective post-training data curation strategy that prunes approximately 40% of redundant training data without sacrificing performance, and a multi-perspective tree search mechanism. This search integrates a learned critic model with strategically designed heuristic rules to diversify tactic selection, prevent the search from getting trapped in unproductive states, and enhance robustness. Extensive evaluations demonstrate that MPS-Prover achieves state-of-the-art performance on multiple challenging benchmarks, including miniF2F and ProofNet, outperforming prior 7B-parameter models. Furthermore, our analyses reveal that MPS-Prover generates significantly shorter and more diverse proofs than existing stepwise and whole-proof methods, highlighting its efficiency and efficacy. Our work advances the capabilities of LLM-based formal reasoning and offers a robust framework and a comprehensive analysis for developing more powerful theorem provers.
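The multi-perspective idea described in the abstract — blending a learned critic's judgment with heuristic rules that diversify tactic selection — can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the paper's implementation: `critic_score`, `heuristic_bonus`, the penalty weight, and the tactic names are all assumptions for illustration.

```python
def critic_score(goal_state: str) -> float:
    # Hypothetical stand-in for a learned critic model.
    # Here a shorter goal string is naively treated as closer to closed.
    return 1.0 / (1.0 + len(goal_state))

def heuristic_bonus(tactic: str, used_tactics: tuple) -> float:
    # Illustrative heuristic rule: penalize re-applying a tactic already
    # used on this branch, pushing the search toward diverse tactics
    # instead of looping in unproductive states.
    return -0.2 * used_tactics.count(tactic)

def rank_candidates(goal_state: str, used_tactics: tuple, candidates: list):
    # Blend the two "perspectives" (critic + heuristics) into one
    # priority per candidate tactic; highest combined score first.
    scored = [
        (critic_score(goal_state) + heuristic_bonus(t, used_tactics), t)
        for t in candidates
    ]
    return sorted(scored, reverse=True)

# Example: "ring" was already applied on this branch, so the heuristic
# demotes it even though the critic scores all candidates equally.
ranked = rank_candidates("a + b = b + a", ("ring",), ["ring", "simp", "omega"])
```

In a real stepwise prover the ranked candidates would feed a best-first tree search; the point of the sketch is only that the final priority is a composite of multiple scoring perspectives rather than the critic alone.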
Community
This is an automated message from the Librarian Bot. The following similar papers were recommended by the Semantic Scholar API:
- Kimina-Prover Preview: Towards Large Formal Reasoning Models with Reinforcement Learning (2025)
- Leanabell-Prover: Posttraining Scaling in Formal Reasoning (2025)
- Neural Theorem Proving: Generating and Structuring Proofs for Formal Verification (2025)
- APOLLO: Automated LLM and Lean Collaboration for Advanced Formal Reasoning (2025)
- FormalMATH: Benchmarking Formal Mathematical Reasoning of Large Language Models (2025)
- Hierarchical Attention Generates Better Proofs (2025)
- LogicTree: Structured Proof Exploration for Coherent and Rigorous Logical Reasoning with Large Language Models (2025)