Article Introducing Modular Diffusers - Composable Building Blocks for Diffusion Pipelines • 8 days ago • 31
LTX-2.3 Collection LTX-2.3 base models, quantized models and accompanying LoRAs and IC-LoRAs • 5 items • Updated 3 days ago • 18
JavisDiT-v1.0 Collection Unified Modeling and Optimization for Joint Audio-Video Generation • 2 items • Updated 15 days ago • 1
JavisDiT++: Unified Modeling and Optimization for Joint Audio-Video Generation Paper • 2602.19163 • Published 18 days ago • 14
SkyReels-V4: Multi-modal Video-Audio Generation, Inpainting and Editing model Paper • 2602.21818 • Published 15 days ago • 52
DreamID-Omni: Unified Framework for Controllable Human-Centric Audio-Video Generation Paper • 2602.12160 • Published 28 days ago • 38
BitDance: Scaling Autoregressive Generative Models with Binary Tokens Paper • 2602.14041 • Published 25 days ago • 52
BitDance Collection BitDance: Open-source autoregressive model with binary visual tokens. A research project for building a powerful multimodal autoregressive model. • 10 items • Updated 10 days ago • 11
JUST-DUB-IT: Video Dubbing via Joint Audio-Visual Diffusion Paper • 2601.22143 • Published Jan 29 • 7
Scaling Text-to-Image Diffusion Transformers with Representation Autoencoders Paper • 2601.16208 • Published Jan 22 • 54
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer Paper • 2601.16515 • Published Jan 23 • 15
Memory-V2V: Augmenting Video-to-Video Diffusion Models with Memory Paper • 2601.16296 • Published Jan 22 • 28
Skywork UniPic 2.0: Building Kontext Model with Online RL for Unified Multimodal Model Paper • 2509.04548 • Published Sep 4, 2025 • 5
Skywork-Unipic3 Collection Unified Multi-Image Composition with Sequence Modeling • 9 items • Updated 10 days ago • 12