Let LLMs Break Free from Overthinking via Self-Braking Tuning • Paper • 2505.14604 • Published 4 days ago • 19
AGENTIF: Benchmarking Instruction Following of Large Language Models in Agentic Scenarios • Paper • 2505.16944 • Published 2 days ago • 6
Training Step-Level Reasoning Verifiers with Formal Verification Tools • Paper • 2505.15960 • Published 3 days ago • 6
The Unreasonable Effectiveness of Entropy Minimization in LLM Reasoning • Paper • 2505.15134 • Published 4 days ago • 3
General-Reasoner: Advancing LLM Reasoning Across All Domains • Paper • 2505.14652 • Published 4 days ago • 17
Fine-tuning Quantized Neural Networks with Zeroth-order Optimization • Paper • 2505.13430 • Published 5 days ago • 10
Two Experts Are All You Need for Steering Thinking: Reinforcing Cognitive Effort in MoE Reasoning Models Without Additional Training • Paper • 2505.14681 • Published 4 days ago • 9