VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks Paper • 2504.05118 • Published 13 days ago • 24
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models Paper • 2504.04718 • Published 14 days ago • 38
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement Paper • 2504.03561 • Published 16 days ago • 17
Concept Lancet: Image Editing with Compositional Representation Transplant Paper • 2504.02828 • Published 17 days ago • 16
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning Paper • 2503.22738 • Published 25 days ago • 15
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay Paper • 2504.03601 • Published 16 days ago • 15
Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1) Paper • 2504.03151 • Published 17 days ago • 12
Generative Evaluation of Complex Reasoning in Large Language Models Paper • 2504.02810 • Published 17 days ago • 12
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model Paper • 2504.05594 • Published 13 days ago • 11
MedSAM2: Segment Anything in 3D Medical Images and Videos Paper • 2504.03600 • Published 16 days ago • 8
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models Paper • 2504.02882 • Published 19 days ago • 6
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning Paper • 2504.05520 • Published 13 days ago • 9
3D Scene Understanding Through Local Random Access Sequence Modeling Paper • 2504.03875 • Published 16 days ago • 5
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking Paper • 2504.03947 • Published 16 days ago • 4
JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model Paper • 2504.03770 • Published 18 days ago • 3
SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills Paper • 2504.07079 • Published 11 days ago • 11
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model Paper • 2503.24290 • Published 20 days ago • 61
Understanding R1-Zero-Like Training: A Critical Perspective Paper • 2503.20783 • Published 25 days ago • 43
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy Paper • 2503.24388 • Published 20 days ago • 29
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models Paper • 2503.22165 • Published 24 days ago • 27
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents Paper • 2504.00906 • Published 19 days ago • 20
Effectively Controlling Reasoning Models through Thinking Intervention Paper • 2503.24370 • Published 20 days ago • 18
Expanding RL with Verifiable Rewards Across Diverse Domains Paper • 2503.23829 • Published 21 days ago • 18
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models Paper • 2503.24377 • Published 20 days ago • 17
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models Paper • 2503.22673 • Published 23 days ago • 12
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis Paper • 2502.18924 • Published Feb 26 • 12
Interpreting Emergent Planning in Model-Free Reinforcement Learning Paper • 2504.01871 • Published 18 days ago • 11
Token-Efficient Long Video Understanding for Multimodal LLMs Paper • 2503.04130 • Published Mar 6 • 93
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion Paper • 2503.11576 • Published Mar 14 • 96
SmolVLM: Redefining small and efficient multimodal models Paper • 2504.05299 • Published 13 days ago • 163
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models Paper • 2503.08120 • Published Mar 11 • 31
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey Paper • 2503.12605 • Published Mar 16 • 34
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster Paper • 2503.09662 • Published Mar 12 • 33
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning Paper • 2503.10291 • Published Mar 13 • 34
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity Paper • 2503.16418 • Published Mar 20 • 35
Modifying Large Language Model Post-Training for Diverse Creative Writing Paper • 2503.17126 • Published about 1 month ago • 36
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation Paper • 2503.22675 • Published 23 days ago • 34
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice Paper • 2503.05978 • Published Mar 7 • 35
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs Paper • 2503.01307 • Published Mar 3 • 37
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse Paper • 2503.16365 • Published Mar 20 • 39
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking Paper • 2502.20730 • Published Feb 28 • 39
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond Paper • 2503.21614 • Published 24 days ago • 39