DeepCritic: Deliberate Critique with Large Language Models Paper • 2505.00662 • Published 11 days ago • 48
WebThinker: Empowering Large Reasoning Models with Deep Research Capability Paper • 2504.21776 • Published 12 days ago • 44
Reinforcement Learning for Reasoning in Large Language Models with One Training Example Paper • 2504.20571 • Published 13 days ago • 90
Toward Evaluative Thinking: Meta Policy Optimization with Evolving Reward Models Paper • 2504.20157 • Published 13 days ago • 35
Skywork R1V2: Multimodal Hybrid Reinforcement Learning for Reasoning Paper • 2504.16656 • Published 19 days ago • 55
I-Con: A Unifying Framework for Representation Learning Paper • 2504.16929 • Published 18 days ago • 30
Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Paper • 2504.15271 • Published 20 days ago • 65
MIG: Automatic Data Selection for Instruction Tuning by Maximizing Information Gain in Semantic Space Paper • 2504.13835 • Published 23 days ago • 36
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published 23 days ago • 121
CLIMB: CLustering-based Iterative Data Mixture Bootstrapping for Language Model Pre-training Paper • 2504.13161 • Published 24 days ago • 88
Efficient Hybrid Language Model Compression through Group-Aware SSM Pruning Paper • 2504.11409 • Published 27 days ago • 10
VisualPuzzles: Decoupling Multimodal Reasoning Evaluation from Domain Knowledge Paper • 2504.10342 • Published 28 days ago • 11
The Scalability of Simplicity: Empirical Analysis of Vision-Language Learning with a Single Transformer Paper • 2504.10462 • Published 28 days ago • 15
xVerify: Efficient Answer Verifier for Reasoning Model Evaluations Paper • 2504.10481 • Published 27 days ago • 84