Latent Diffusion Model without Variational Autoencoder Paper • 2510.15301 • Published 4 days ago • 30
NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks Paper • 2510.15019 • Published 4 days ago • 50
A Theoretical Study on Bridging Internal Probability and Self-Consistency for LLM Reasoning Paper • 2510.15444 • Published 4 days ago • 112
Scaling Instruction-Based Video Editing with a High-Quality Synthetic Dataset Paper • 2510.15742 • Published 3 days ago • 37
ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints Paper • 2510.14847 • Published 4 days ago • 50
WithAnyone: Towards Controllable and ID Consistent Image Generation Paper • 2510.14975 • Published 4 days ago • 74
FlashWorld: High-quality 3D Scene Generation within Seconds Paper • 2510.13678 • Published 5 days ago • 64
UniMoE-Audio: Unified Speech and Music Generation with Dynamic-Capacity MoE Paper • 2510.13344 • Published 6 days ago • 59
UniFusion: Vision-Language Model as Unified Encoder in Image Generation Paper • 2510.12789 • Published 6 days ago • 15
FlashVSR: Towards Real-Time Diffusion-Based Streaming Video Super-Resolution Paper • 2510.12747 • Published 6 days ago • 33
Spatial Forcing: Implicit Spatial Representation Alignment for Vision-language-action Model Paper • 2510.12276 • Published 7 days ago • 139
BrowserAgent: Building Web Agents with Human-Inspired Web Browsing Actions Paper • 2510.10666 • Published 8 days ago • 27
Diffusion Transformers with Representation Autoencoders Paper • 2510.11690 • Published 7 days ago • 152
QeRL: Beyond Efficiency -- Quantization-enhanced Reinforcement Learning for LLMs Paper • 2510.11696 • Published 7 days ago • 162