LOVE-R1: Advancing Long Video Understanding with an Adaptive Zoom-in Mechanism via Multi-Step Reasoning Paper • 2509.24786 • Published 14 days ago
HumanOmniV2: From Understanding to Omni-Modal Reasoning with Context Paper • 2506.21277 • Published Jun 26
PoSynDA: Multi-Hypothesis Pose Synthesis Domain Adaptation for Robust 3D Human Pose Estimation Paper • 2308.09678 • Published Aug 18, 2023
Frozen-DETR: Enhancing DETR with Image Understanding from Frozen Foundation Models Paper • 2410.19635 • Published Oct 25, 2024
LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models Paper • 2501.18954 • Published Jan 31