ShareGPT-4o-Image: Aligning Multimodal Models with GPT-4o-Level Image Generation Paper • 2506.18095 • Published 4 days ago • 53
Guidance in the Frequency Domain Enables High-Fidelity Sampling at Low CFG Scales Paper • 2506.19713 • Published 2 days ago • 11
SimpleGVR: A Simple Baseline for Latent-Cascaded Video Super-Resolution Paper • 2506.19838 • Published 2 days ago • 10
ScaleCap: Inference-Time Scalable Image Captioning via Dual-Modality Debiasing Paper • 2506.19848 • Published 2 days ago • 24
AnimaX: Animating the Inanimate in 3D with Joint Video-Pose Diffusion Models Paper • 2506.19851 • Published 2 days ago • 48
OmniGen2: Exploration to Advanced Multimodal Generation Paper • 2506.18871 • Published 3 days ago • 65
VMem: Consistent Interactive Video Scene Generation with Surfel-Indexed View Memory Paper • 2506.18903 • Published 3 days ago • 16
Auto-Regressively Generating Multi-View Consistent Images Paper • 2506.18527 • Published 3 days ago • 6
ImmerseGen: Agent-Guided Immersive World Generation with Alpha-Textured Proxies Paper • 2506.14315 • Published 9 days ago • 10
Hunyuan3D 2.5: Towards High-Fidelity 3D Assets Generation with Ultimate Details Paper • 2506.16504 • Published 7 days ago • 19
Hunyuan-GameCraft: High-dynamic Interactive Game Video Generation with Hybrid History Condition Paper • 2506.17201 • Published 6 days ago • 39
CLEAR: Character Unlearning in Textual and Visual Modalities Paper • 2410.18057 • Published Oct 23, 2024 • 210