AbstentionBench: Reasoning LLMs Fail on Unanswerable Questions • Paper • 2506.09038 • Published 11 days ago
COMPACT: COMPositional Atomic-to-Complex Visual Capability Tuning • Paper • 2504.21850 • Published Apr 30
Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks • Paper • 2310.19909 • Published Oct 30, 2023
Does Progress On Object Recognition Benchmarks Improve Real-World Generalization? • Paper • 2307.13136 • Published Jul 24, 2023
PUG: Photorealistic and Semantically Controllable Synthetic Data for Representation Learning • Paper • 2308.03977 • Published Aug 8, 2023
A Whac-A-Mole Dilemma: Shortcuts Come in Multiples Where Mitigating One Amplifies Others • Paper • 2212.04825 • Published Dec 9, 2022
UniBench: Visual Reasoning Requires Rethinking Vision-Language Beyond Scaling • Paper • 2408.04810 • Published Aug 9, 2024