Temporal Preference Optimization for Long-Form Video Understanding Paper • 2501.13919 • Published Jan 23 • 22 • 3
IMAGINE-E: Image Generation Intelligence Evaluation of State-of-the-art Text-to-Image Models Paper • 2501.13920 • Published Jan 23 • 17 • 2
OmniThink: Expanding Knowledge Boundaries in Machine Writing through Thinking Paper • 2501.09751 • Published Jan 16 • 49 • 2
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot Paper • 2501.09012 • Published Jan 15 • 10 • 2
MMDocIR: Benchmarking Multi-Modal Retrieval for Long Documents Paper • 2501.08828 • Published Jan 15 • 32 • 2
Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion Paper • 2501.09019 • Published Jan 15 • 12 • 2
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them Paper • 2501.08292 • Published Jan 14 • 17 • 2
VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control Paper • 2501.01427 • Published Jan 2 • 55 • 3