Learning Video Generation for Robotic Manipulation with Collaborative Trajectory Control Paper • 2506.01943 • Published 25 days ago • 24
MMSI-Bench: A Benchmark for Multi-Image Spatial Intelligence Paper • 2505.23764 • Published 29 days ago • 4
Multi-SpatialMLLM: Multi-Frame Spatial Understanding with Multi-Modal Large Language Models Paper • 2505.17015 • Published May 22 • 8
PointLLM: Empowering Large Language Models to Understand Point Clouds Paper • 2308.16911 • Published Aug 31, 2023 • 1
EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI Paper • 2312.16170 • Published Dec 26, 2023 • 1
Fine-Grained Cross-View Geo-Localization Using a Correlation-Aware Homography Estimator Paper • 2308.16906 • Published Aug 31, 2023