Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding Paper • 2509.15178 • Published 18 days ago • 5 • 2
HunyuanWorld 1.0: Generating Immersive, Explorable, and Interactive 3D Worlds from Words or Pixels Paper • 2507.21809 • Published Jul 29 • 128
Shape-for-Motion: Precise and Consistent Video Editing with 3D Proxy Paper • 2506.22432 • Published Jun 27 • 13 • 1