Hidden in plain sight: VLMs overlook their visual representations Paper • 2506.08008 • Published Jun 2025 • 8 • 1
B-score: Detecting biases in large language models using response history Paper • 2505.18545 • Published May 24, 2025 • 31 • 2
VideoGameQA-Bench: Evaluating Vision-Language Models for Video Game Quality Assurance Paper • 2505.15952 • Published May 21, 2025 • 20 • 2
Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path? Paper • 2502.15657 • Published Feb 21, 2025 • 5 • 2
ZeroBench: An Impossible Visual Benchmark for Contemporary Large Multimodal Models Paper • 2502.09696 • Published Feb 13, 2025 • 44 • 5
VideoGameBunny: Towards vision assistants for video games Paper • 2407.15295 • Published Jul 21, 2024 • 22 • 6
EXAONE 3.5: Series of Large Language Models for Real-world Use Cases Paper • 2412.04862 • Published Dec 6, 2024 • 51 • 5