Vision as a Dialect: Unifying Visual Understanding and Generation via Text-Aligned Representations Paper • 2506.18898 • Published 3 days ago • 23
Align Your Flow: Scaling Continuous-Time Flow Map Distillation Paper • 2506.14603 • Published 9 days ago • 18
Qwen2.5-Omni Collection End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated May 21 • 145
Sana Collection ⚡️Sana: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer • 21 items • Updated Apr 17 • 92
Flux.1-dev ControlNets Collection A collection of ControlNet models for Flux.1-dev by Jasper Research • 4 items • Updated Sep 24, 2024 • 23