Unleashing Vecset Diffusion Model for Fast Shape Generation Paper • 2503.16302 • Published 2 days ago • 32
Unposed Sparse Views Room Layout Reconstruction in the Age of Pretrain Model Paper • 2502.16779 • Published 26 days ago • 2
Reflective Planning: Vision-Language Models for Multi-Stage Long-Horizon Robotic Manipulation Paper • 2502.16707 • Published 27 days ago • 11
Chimera: Improving Generalist Model with Domain-Specific Experts Paper • 2412.05983 • Published Dec 8, 2024 • 9
AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information? Paper • 2412.02611 • Published Dec 3, 2024 • 24
Remember, Retrieve and Generate: Understanding Infinite Visual Concepts as Your Personalized Assistant Paper • 2410.13360 • Published Oct 17, 2024 • 9
Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations Paper • 2410.08049 • Published Oct 10, 2024 • 8