Decoupled DMD: CFG Augmentation as the Spear, Distribution Matching as the Shield Paper • 2511.22677 • Published 27 days ago • 28
CASA Collection • CASA: Cross-Attention as Self-Attention for Efficient Vision-Language Fusion on long context streaming inputs • 6 items • Updated 1 day ago • 4
3D-RE-GEN: 3D Reconstruction of Indoor Scenes with a Generative Framework Paper • 2512.17459 • Published 5 days ago • 7
MeshSplatting: Differentiable Rendering with Opaque Meshes Paper • 2512.06818 • Published 17 days ago • 10
V-RGBX: Video Editing with Accurate Controls over Intrinsic Properties Paper • 2512.11799 • Published 12 days ago • 29
VQRAE: Representation Quantization Autoencoders for Multimodal Understanding, Generation and Reconstruction Paper • 2511.23386 • Published 26 days ago • 15
EgoEdit: Dataset, Real-Time Streaming Model, and Benchmark for Egocentric Video Editing Paper • 2512.06065 • Published 19 days ago • 28
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 27 days ago • 209
Real-time Vision Models Collection • A collection of real-time detectors. • 19 items • Updated Nov 23 • 22