Papers - a HariharaIII Collection

Papers

updated 11 days ago

VAPO: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks

Paper • 2504.05118 • Published 13 days ago • 24
T1: Tool-integrated Self-verification for Test-time Compute Scaling in Small Language Models

Paper • 2504.04718 • Published 14 days ago • 38
SynWorld: Virtual Scenario Synthesis for Agentic Action Knowledge Refinement

Paper • 2504.03561 • Published 16 days ago • 17
Concept Lancet: Image Editing with Compositional Representation Transplant

Paper • 2504.02828 • Published 17 days ago • 16
ShieldAgent: Shielding Agents via Verifiable Safety Policy Reasoning

Paper • 2503.22738 • Published 25 days ago • 15
APIGen-MT: Agentic Pipeline for Multi-Turn Data Generation via Simulated Agent-Human Interplay

Paper • 2504.03601 • Published 16 days ago • 15
LiveVQA: Live Visual Knowledge Seeking

Paper • 2504.05288 • Published 13 days ago • 13
Why Reasoning Matters? A Survey of Advancements in Multimodal Reasoning (v1)

Paper • 2504.03151 • Published 17 days ago • 12
Generative Evaluation of Complex Reasoning in Large Language Models

Paper • 2504.02810 • Published 17 days ago • 12
Tuning-Free Image Editing with Fidelity and Editability via Unified Latent Diffusion Model

Paper • 2504.05594 • Published 13 days ago • 11
MedSAM2: Segment Anything in 3D Medical Images and Videos

Paper • 2504.03600 • Published 16 days ago • 8
DiaTool-DPO: Multi-Turn Direct Preference Optimization for Tool-Augmented Large Language Models

Paper • 2504.02882 • Published 19 days ago • 6
Efficient Reinforcement Finetuning via Adaptive Curriculum Learning

Paper • 2504.05520 • Published 13 days ago • 9
3D Scene Understanding Through Local Random Access Sequence Modeling

Paper • 2504.03875 • Published 16 days ago • 5
Distillation and Refinement of Reasoning in Small Language Models for Document Re-ranking

Paper • 2504.03947 • Published 16 days ago • 4
JailDAM: Jailbreak Detection with Adaptive Memory for Vision-Language Model

Paper • 2504.03770 • Published 18 days ago • 3
SkillWeaver: Web Agents can Self-Improve by Discovering and Honing Skills

Paper • 2504.07079 • Published 11 days ago • 11
Rethinking Reflection in Pre-Training

Paper • 2504.04022 • Published 16 days ago • 75
Open-Reasoner-Zero: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model

Paper • 2503.24290 • Published 20 days ago • 61
Understanding R1-Zero-Like Training: A Critical Perspective

Paper • 2503.20783 • Published 25 days ago • 43
RIG: Synergizing Reasoning and Imagination in End-to-End Generalist Policy

Paper • 2503.24388 • Published 20 days ago • 29
Agentic Knowledgeable Self-awareness

Paper • 2504.03553 • Published 16 days ago • 27
Landscape of Thoughts: Visualizing the Reasoning Process of Large Language Models

Paper • 2503.22165 • Published 24 days ago • 27
Agent S2: A Compositional Generalist-Specialist Framework for Computer Use Agents

Paper • 2504.00906 • Published 19 days ago • 20
Effectively Controlling Reasoning Models through Thinking Intervention

Paper • 2503.24370 • Published 20 days ago • 18
Expanding RL with Verifiable Rewards Across Diverse Domains

Paper • 2503.23829 • Published 21 days ago • 18
Harnessing the Reasoning Economy: A Survey of Efficient Reasoning for Large Language Models

Paper • 2503.24377 • Published 20 days ago • 17
ActionStudio: A Lightweight Framework for Data and Training of Large Action Models

Paper • 2503.22673 • Published 23 days ago • 12
MegaTTS 3: Sparse Alignment Enhanced Latent Diffusion Transformer for Zero-Shot Speech Synthesis

Paper • 2502.18924 • Published Feb 26 • 12
Interpreting Emergent Planning in Model-Free Reinforcement Learning

Paper • 2504.01871 • Published 18 days ago • 11
START: Self-taught Reasoner with Tools

Paper • 2503.04625 • Published Mar 6 • 108
Token-Efficient Long Video Understanding for Multimodal LLMs

Paper • 2503.04130 • Published Mar 6 • 93
SmolDocling: An ultra-compact vision-language model for end-to-end multi-modal document conversion

Paper • 2503.11576 • Published Mar 14 • 96
SmolVLM: Redefining small and efficient multimodal models

Paper • 2504.05299 • Published 13 days ago • 163
UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models

Paper • 2503.08120 • Published Mar 11 • 31
Multimodal Chain-of-Thought Reasoning: A Comprehensive Survey

Paper • 2503.12605 • Published Mar 16 • 34
CoRe^2: Collect, Reflect and Refine to Generate Better and Faster

Paper • 2503.09662 • Published Mar 12 • 33
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning

Paper • 2503.10291 • Published Mar 13 • 34
InfiniteYou: Flexible Photo Recrafting While Preserving Your Identity

Paper • 2503.16418 • Published Mar 20 • 35
Modifying Large Language Model Post-Training for Diverse Creative Writing

Paper • 2503.17126 • Published about 1 month ago • 36
Think Before Recommend: Unleashing the Latent Reasoning Power for Sequential Recommendation

Paper • 2503.22675 • Published 23 days ago • 34
MagicInfinite: Generating Infinite Talking Videos with Your Words and Voice

Paper • 2503.05978 • Published Mar 7 • 35
API Agents vs. GUI Agents: Divergence and Convergence

Paper • 2503.11069 • Published Mar 14 • 35
Cognitive Behaviors that Enable Self-Improving Reasoners, or, Four Habits of Highly Effective STaRs

Paper • 2503.01307 • Published Mar 3 • 37
JARVIS-VLA: Post-Training Large-Scale Vision Language Models to Play Visual Games with Keyboards and Mouse

Paper • 2503.16365 • Published Mar 20 • 39
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking

Paper • 2502.20730 • Published Feb 28 • 39
Process-based Self-Rewarding Language Models

Paper • 2503.03746 • Published Mar 5 • 39
A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond

Paper • 2503.21614 • Published 24 days ago • 39
EgoLife: Towards Egocentric Life Assistant

Paper • 2503.03803 • Published Mar 5 • 41