OmniInsert: Mask-Free Video Insertion of Any Reference via Diffusion Transformer Models • 2509.17627 • Published Sep 2025
HuMo: Human-Centric Video Generation via Collaborative Multi-Modal Conditioning • 2509.08519 • Published Sep 10, 2025
UMO: Scaling Multi-Identity Consistency for Image Customization via Matching Reward • 2509.06818 • Published Sep 8, 2025
USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning • 2508.18966 • Published Aug 26, 2025
DreamID: High-Fidelity and Fast Diffusion-Based Face Swapping via Triplet ID Group Learning • 2504.14509 • Published Apr 20, 2025
Less-to-More Generalization: Unlocking More Controllability by In-Context Generation • 2504.02160 • Published Apr 2, 2025
Seedream 2.0: A Native Chinese-English Bilingual Image Generation Foundation Model • 2503.07703 • Published Mar 10, 2025
VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control • 2412.20800 • Published Dec 30, 2024