UniVidX: A Unified Multimodal Framework for Versatile Video Generation via Diffusion Priors • Paper • 2605.00658 • Published 3 days ago • 62
Bridging Semantic and Kinematic Conditions with Diffusion-based Discrete Motion Tokenizer • Paper • 2603.19227 • Published Mar 19 • 42
FantasyVLN: Unified Multimodal Chain-of-Thought Reasoning for Vision-Language Navigation • Paper • 2601.13976 • Published Jan 20 • 22
stabilityai/stable-diffusion-3.5-large-controlnet-canny • Text-to-Image • Updated Nov 28, 2024 • 17.3k • 15