OmniGen2: Exploration to Advanced Multimodal Generation • Paper • 2506.18871 • Published 3 days ago
Light of Normals: Unified Feature Representation for Universal Photometric Stereo • Paper • 2506.18882 • Published 3 days ago
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding • Paper • 2506.16035 • Published 8 days ago
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents • Paper • 2506.11763 • Published 13 days ago
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention • Paper • 2506.13585 • Published 10 days ago
Scientists' First Exam: Probing Cognitive Abilities of MLLM via Perception, Understanding, and Reasoning • Paper • 2506.10521 • Published 14 days ago
Emerging Properties in Unified Multimodal Pretraining • Paper • 2505.14683 • Published May 20