EvolProver: Advancing Automated Theorem Proving by Evolving Formalized Problems via Symmetry and Difficulty Paper • 2510.00732 • Published 12 days ago • 5
AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness Paper • 2507.01702 • Published Jul 2 • 2
Codec Does Matter: Exploring the Semantic Shortcoming of Codec for Audio Language Model Paper • 2408.17175 • Published Aug 30, 2024 • 5
MCP-Universe: Benchmarking Large Language Models with Real-World Model Context Protocol Servers Paper • 2508.14704 • Published Aug 20 • 42
Article ✴️ ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use By Ziyang and 1 other • Jan 3 • 19