ViLBench: A Suite for Vision-Language Process Reward Modeling Paper • 2503.20271 • Published 5 days ago • 6
"Principal Components" Enable A New Language of Images Paper • 2503.08685 • Published 19 days ago • 11
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning Paper • 2503.06960 • Published 21 days ago • 3
Scaling Laws in Patchification: An Image Is Worth 50,176 Tokens And More Paper • 2502.03738 • Published Feb 6 • 11
Libra-Leaderboard: Towards Responsible AI through a Balanced Leaderboard of Safety and Capability Paper • 2412.18551 • Published Dec 24, 2024
Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights Paper • 2405.21070 • Published May 31, 2024
CLIPS: An Enhanced CLIP Framework for Learning with Synthetic Captions Paper • 2411.16828 • Published Nov 25, 2024 • 1