ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs Paper • 2506.18896 • Published 3 days ago • 25
MaxMin-RLHF: Towards Equitable Alignment of Large Language Models with Diverse Human Preferences Paper • 2402.08925 • Published Feb 14, 2024 • 1
TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling Paper • 2410.16033 • Published Oct 18, 2024
Temporal Consistency for LLM Reasoning Process Error Identification Paper • 2503.14495 • Published Mar 18 • 10
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models Paper • 2503.24377 • Published Mar 31 • 17
EmoAgent: Assessing and Safeguarding Human-AI Interaction for Mental Health Safety Paper • 2504.09689 • Published Apr 13 • 7
On Path to Multimodal Historical Reasoning: HistBench and HistAgent Paper • 2505.20246 • Published May 26
Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution Paper • 2505.20286 • Published May 26 • 7