MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization • Paper • 2503.16874 • Published Mar 21 • 44
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning • Paper • 2505.23380 • Published 13 days ago • 23
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning • Paper • 2505.23754 • Published 13 days ago • 15
Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence • Paper • 2505.20325 • Published 19 days ago • 45
I Have Covered All the Bases Here: Interpreting Reasoning Features in Large Language Models via Sparse Autoencoders • Paper • 2503.18878 • Published Mar 24 • 118
Pre-trained Large Language Models Learn Hidden Markov Models In-context • Paper • 2506.07298 • Published 2 days ago • 17