IV-Bench: A Benchmark for Image-Grounded Video Perception and Reasoning in Multimodal LLMs Paper • 2504.15415 • Published Apr 21 • 22
Can MLLMs Understand the Deep Implication Behind Chinese Images? Paper • 2410.13854 • Published Oct 17, 2024 • 12