Double-Checker: Enhancing Reasoning of Slow-Thinking LLMs via Self-Critical Fine-Tuning Paper • 2506.21285 • Published 24 days ago
Teaching LLMs According to Their Aptitude: Adaptive Reasoning for Mathematical Problem Solving Paper • 2502.12022 • Published Feb 17
SVGenius: Benchmarking LLMs in SVG Understanding, Editing and Generation Paper • 2506.03139 • Published Jun 3 • 14
Do Large Language Models Excel in Complex Logical Reasoning with Formal Language? Paper • 2505.16998 • Published May 22 • 2
Triad: A Framework Leveraging a Multi-Role LLM-based Agent to Solve Knowledge Base Question Answering Paper • 2402.14320 • Published Feb 22, 2024
ViewSpatial-Bench: Evaluating Multi-perspective Spatial Localization in Vision-Language Models Paper • 2505.21500 • Published May 27 • 12
UGPhysics: A Comprehensive Benchmark for Undergraduate Physics Reasoning with Large Language Models Paper • 2502.00334 • Published Feb 1
AskToAct: Enhancing LLMs Tool Use via Self-Correcting Clarification Paper • 2503.01940 • Published Mar 3
Let LLMs Break Free from Overthinking via Self-Braking Tuning Paper • 2505.14604 • Published May 20 • 23
Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning Paper • 2505.14684 • Published May 20 • 23
VerifyBench: Benchmarking Reference-based Reward Systems for Large Language Models Paper • 2505.15801 • Published May 21 • 17
MathFimer: Enhancing Mathematical Reasoning by Expanding Reasoning Steps through Fill-in-the-Middle Task Paper • 2502.11684 • Published Feb 17
S$^3$c-Math: Spontaneous Step-level Self-correction Makes Large Language Models Better Mathematical Reasoners Paper • 2409.01524 • Published Sep 3, 2024 • 1
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning Paper • 2409.12929 • Published Sep 19, 2024
InftyThink: Breaking the Length Limits of Long-Context Reasoning in Large Language Models Paper • 2503.06692 • Published Mar 9 • 2