DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos Paper • 2409.02095 • Published Sep 3, 2024 • 37
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Paper • 2404.14396 • Published Apr 22, 2024 • 19
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation Paper • 2312.09251 • Published Dec 14, 2023 • 10