Evaluation is All You Need: Strategic Overclaiming of LLM Reasoning Capabilities Through Evaluation Design • Paper • 2506.04734 • Published Jun 5 • 19
BLIP3-o: A Family of Fully Open Unified Multimodal Models-Architecture, Training and Dataset • Paper • 2505.09568 • Published May 14 • 95
microsoft/table-transformer-structure-recognition • Object Detection • 0.0B • Updated Sep 6, 2023 • 657k • 195
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? • Paper • 2504.13837 • Published Apr 18 • 132
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? • Paper • 2502.15657 • Published Feb 21 • 5
Expanding Performance Boundaries of Open-Source Multimodal Models with Model, Data, and Test-Time Scaling • Paper • 2412.05271 • Published Dec 6, 2024 • 160