StreamVLN: Streaming Vision-and-Language Navigation via SlowFast Context Modeling Paper • 2507.05240 • Published 3 days ago • 36
GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scenes Paper • 2505.20294 • Published May 26 • 4
NavDP: Learning Sim-to-Real Navigation Diffusion Policy with Privileged Information Guidance Paper • 2505.08712 • Published May 13 • 5 • 2
LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness Paper • 2409.18125 • Published Sep 26, 2024 • 35
GRUtopia: Dream General Robots in a City at Scale Paper • 2407.10943 • Published Jul 15, 2024 • 26
Multi-View 3D Visual Grounding 🪑 Display competition-related information and manage submissions • Running • 21
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI Paper • 2312.16170 • Published Dec 26, 2023 • 1