SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation Paper • 2502.13128 • Published 3 days ago • 31
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 7 days ago • 49
Stable Flow: Vital Layers for Training-Free Image Editing Paper • 2411.14430 • Published Nov 21, 2024 • 22
OmniHuman-1: Rethinking the Scaling-Up of One-Stage Conditioned Human Animation Models Paper • 2502.01061 • Published 18 days ago • 179
Streaming DiLoCo with overlapping communication: Towards a Distributed Free Lunch Paper • 2501.18512 • Published 22 days ago • 27
[MASK] is All You Need Collection Code, dataset, and pretrained model • 6 items • Updated 15 days ago • 9
PaliGemma 2 Release Collection Vision-Language Models available in multiple 3B, 10B and 28B variants. • 23 items • Updated Dec 13, 2024 • 141
FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait Paper • 2412.01064 • Published Dec 2, 2024 • 26
Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations Paper • 2410.10792 • Published Oct 14, 2024 • 30
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models Paper • 2410.02416 • Published Oct 3, 2024 • 26
Loradex Highlights Collection This collection features awesome opensource LoRAs trained by members of the Glif Community during Loradex Early Access! • 14 items • Updated Oct 18, 2024 • 19