MMSkills: Towards Multimodal Skills for General Visual Agents Paper • 2605.13527 • Published 10 days ago • 117
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence Paper • 2605.12882 • Published 11 days ago • 262
UniPool: A Globally Shared Expert Pool for Mixture-of-Experts Paper • 2605.06665 • Published 17 days ago • 11
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems Paper • 2605.04018 • Published 19 days ago • 40
TCDA: Thread-Constrained Discourse-Aware Modeling for Conversational Sentiment Quadruple Analysis Paper • 2605.01717 • Published 21 days ago • 6
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation Paper • 2604.28196 • Published 24 days ago • 71
Multiplication in Multimodal LLMs: Computation with Text, Image, and Audio Inputs Paper • 2604.18203 • Published Apr 20 • 6