- Personalize Anything for Free with Diffusion Transformer (arXiv:2503.12590, published 12 days ago)
- FlowTok: Flowing Seamlessly Across Text and Image Tokens (arXiv:2503.10772, published 15 days ago)
- Kolmogorov-Arnold Attention: Is Learnable Attention Better For Vision Transformers? (arXiv:2503.10632, published 15 days ago)
- PLADIS: Pushing the Limits of Attention in Diffusion Models at Inference Time by Leveraging Sparsity (arXiv:2503.07677, published 19 days ago)
- Automated Movie Generation via Multi-Agent CoT Planning (arXiv:2503.07314, published 18 days ago)
- ObjectMover: Generative Object Movement with Video Prior (arXiv:2503.08037, published 18 days ago)
- One-step Diffusion Models with f-Divergence Distribution Matching (arXiv:2502.15681, published Feb 21)
- Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention (arXiv:2502.11089, published Feb 16)
- I Think, Therefore I Diffuse: Enabling Multimodal In-Context Reasoning in Diffusion Models (arXiv:2502.10458, published Feb 12)
- Small Models Struggle to Learn from Strong Reasoners (arXiv:2502.12143, published Feb 17)
- CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up (arXiv:2412.16112, published Dec 20, 2024)
- Rethinking Large-scale Dataset Compression: Shifting Focus From Labels to Images (arXiv:2502.06434, published Feb 10)