Localizing Moments in Long Video Via Multimodal Guidance Paper • 2302.13372 • Published Feb 26, 2023
MoDA: Modulation Adapter for Fine-Grained Visual Grounding in Instructional MLLMs Paper • 2506.01850 • Published 12 days ago