SkyReels-Audio: Omni Audio-Conditioned Talking Portraits in Video Diffusion Transformers Paper • 2506.00830 • Published 11 days ago • 7
SeedVR2: One-Step Video Restoration via Diffusion Adversarial Post-Training Paper • 2506.05301 • Published 7 days ago • 56
Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models Paper • 2506.05176 • Published 7 days ago • 55
STARFlow: Scaling Latent Normalizing Flows for High-resolution Image Synthesis Paper • 2506.06276 • Published 6 days ago • 18
Teller: Real-Time Streaming Audio-Driven Portrait Animation with Autoregressive Motion Generation Paper • 2503.18429 • Published Mar 24 • 2
OmniTalker: Real-Time Text-Driven Talking Head Generation with In-Context Audio-Visual Style Replication Paper • 2504.02433 • Published Apr 3 • 1
RelationAdapter: Learning and Transferring Visual Relation with Diffusion Transformers Paper • 2506.02528 • Published 9 days ago • 15
FlowMo: Variance-Based Flow Guidance for Coherent Motion in Video Generation Paper • 2506.01144 • Published 11 days ago • 14
AnimeShooter: A Multi-Shot Animation Dataset for Reference-Guided Video Generation Paper • 2506.03126 • Published 9 days ago • 22
TalkingMachines: Real-Time Audio-Driven FaceTime-Style Video via Autoregressive Diffusion Models Paper • 2506.03099 • Published 9 days ago • 8
LayerFlow: A Unified Model for Layer-aware Video Generation Paper • 2506.04228 • Published 8 days ago • 13
OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data Paper • 2505.18445 • Published 19 days ago • 63
Alchemist: Turning Public Text-to-Image Data into Generative Gold Paper • 2505.19297 • Published 18 days ago • 74
Training-Free Efficient Video Generation via Dynamic Token Carving Paper • 2505.16864 • Published 21 days ago • 21
Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities Paper • 2505.02567 • Published May 5 • 75
Diffusion vs. Autoregressive Language Models: A Text Embedding Perspective Paper • 2505.15045 • Published 22 days ago • 54
LightLab: Controlling Light Sources in Images with Diffusion Models Paper • 2505.09608 • Published 29 days ago • 31
GIE-Bench: Towards Grounded Evaluation for Text-Guided Image Editing Paper • 2505.11493 • Published 27 days ago • 3