Does Math Reasoning Improve General LLM Capabilities? Understanding Transferability of LLM Reasoning Paper • 2507.00432 • Published 3 days ago • 44
Mind2Web 2: Evaluating Agentic Search with Agent-as-a-Judge Paper • 2506.21506 • Published 7 days ago • 45
UGround Collection Navigating GUIs as Humans Do: Universal Visual Grounding for GUI Agents (ICLR'25 Oral) • 10 items • Updated May 4 • 7
An Illusion of Progress? Assessing the Current State of Web Agents Paper • 2504.01382 • Published Apr 2 • 3
WebDreamer Collection Is Your LLM Secretly a World Model of the Internet? Model-Based Planning for Web Agents • 6 items • Updated Apr 14 • 5
Mind2Web Collection Towards Generalist Agents for the Web (NeurIPS'23 Spotlight) • 7 items • Updated Apr 9