VideoAuteur: Towards Long Narrative Video Generation Paper • 2501.06173 • Published 8 days ago • 30
Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps Paper • 2501.09732 • Published 1 day ago • 40
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Paper • 2501.09019 • Published 3 days ago • 10
Newborn Article Impact Predict: Use title and abstract to predict future academic impact Space • Running on Zero • 15
Identity-Preserving Text-to-Video Generation by Frequency Decomposition Paper • 2411.17440 • Published Nov 26, 2024 • 35
FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity Paper • 2411.15411 • Published Nov 23, 2024 • 7
VLRewardBench: A Challenging Benchmark for Vision-Language Generative Reward Models Paper • 2411.17451 • Published Nov 26, 2024 • 10
Autoregressive Models in Vision: A Survey Paper • 2411.05902 • Published Nov 8, 2024 • 17 • 2
MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models Paper • 2410.10139 • Published Oct 14, 2024 • 51