Temporal Preference Optimization for Long-Form Video Understanding • Paper • 2501.13919 • Published 7 days ago • 21 • 3
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models • Paper • 2501.13920 • Published 7 days ago • 13 • 2
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking • Paper • 2501.09751 • Published 14 days ago • 47 • 2
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot • Paper • 2501.09012 • Published 15 days ago • 10 • 2
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents • Paper • 2501.08828 • Published 15 days ago • 30 • 2
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion • Paper • 2501.09019 • Published 15 days ago • 12 • 2
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them • Paper • 2501.08292 • Published 16 days ago • 17 • 2
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control • Paper • 2501.01427 • Published 28 days ago • 49 • 3