StereoPilot: Learning Unified and Efficient Stereo Conversion via Generative Priors Paper • 2512.16915 • Published 6 days ago • 37
Wan-Move: Motion-controllable Video Generation via Latent Trajectory Guidance Paper • 2512.08765 • Published 15 days ago • 125
SimScale: Learning to Drive via Real-World Simulation at Scale Paper • 2511.23369 • Published 26 days ago • 37
DualCamCtrl: Dual-Branch Diffusion Model for Geometry-Aware Camera-Controlled Video Generation Paper • 2511.23127 • Published 26 days ago • 43
Self-Forcing++: Towards Minute-Scale High-Quality Video Generation Paper • 2510.02283 • Published Oct 2 • 95
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published Oct 13 • 176
DC-VideoGen: Efficient Video Generation with Deep Compression Video Autoencoder Paper • 2509.25182 • Published Sep 29 • 37
SANA-Video: Efficient Video Generation with Block Linear Diffusion Transformer Paper • 2509.24695 • Published Sep 29 • 44
Video-MTR: Reinforced Multi-Turn Reasoning for Long Video Understanding Paper • 2508.20478 • Published Aug 28 • 17
Long-Context Autoregressive Video Modeling with Next-Frame Prediction Paper • 2503.19325 • Published Mar 25 • 73
DART: Denoising Autoregressive Transformer for Scalable Text-to-Image Generation Paper • 2410.08159 • Published Oct 10, 2024 • 26
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens Paper • 2410.13863 • Published Oct 17, 2024 • 37
PixArt-Alpha Collection This collection organize all the PixArt-Alpha related models, datasets and so on. • 9 items • Updated May 4, 2024 • 5
SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation Paper • 2404.14396 • Published Apr 22, 2024 • 19
SEED-Story: Multimodal Long Story Generation with Large Language Model Paper • 2407.08683 • Published Jul 11, 2024 • 24