OmniGen2: Exploration to Advanced Multimodal Generation • Paper • 2506.18871 • Published 5 days ago • 67
VideoDeepResearch: Long Video Understanding With Agentic Tool Using • Paper • 2506.10821 • Published 16 days ago • 19
EVEv2: Improved Baselines for Encoder-Free Vision-Language Models • Paper • 2502.06788 • Published Feb 10 • 13
X2I Dataset • Collection • Datasets used in OmniGen-v1. (v2 is coming soon :) ) • 5 items • Updated Apr 28 • 18
MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval • Paper • 2412.14475 • Published Dec 19, 2024 • 55