3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination • Paper • 2406.05132 • Published Jun 7, 2024
LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent • Paper • 2309.12311 • Published Sep 21, 2023
Sound Localization from Motion: Jointly Learning Sound Direction and Camera Rotation • Paper • 2303.11329 • Published Mar 20, 2023
Understanding 3D Object Interaction from a Single Image • Paper • 2305.09664 • Published May 16, 2023