VisualWebBench: How Far Have Multimodal LLMs Evolved in Web Page Understanding and Grounding? Paper • 2404.05955 • Published Apr 9, 2024
Harnessing Webpage UIs for Text-Rich Visual Understanding Paper • 2410.13824 • Published Oct 17, 2024