Training Step-Level Reasoning Verifiers with Formal Verification Tools
Paper • 2505.15960
Process Reward Models (PRMs) trained on step-level error labels automatically annotated by formal verification tools.