ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published 17 days ago
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models Paper • 2501.03124 • Published Jan 6
Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs Paper • 2505.11227 • Published May 16