How Well Does GPT-4o Understand Vision? Evaluating Multimodal Foundation Models on Standard Computer Vision Tasks Paper • 2507.01955 • Published 8 days ago • 29
LoRAShop: Training-Free Multi-Concept Image Generation and Editing with Rectified Flow Transformers Paper • 2505.23758 • Published May 29 • 23
RoPECraft: Training-Free Motion Transfer with Trajectory-Guided RoPE Optimization on Diffusion Transformers Paper • 2505.13344 • Published May 19 • 6
SliderSpace: Decomposing the Visual Capabilities of Diffusion Models Paper • 2502.01639 • Published Feb 3 • 26