Beyond I.I.D.: Three Levels of Generalization for Question Answering on Knowledge Bases Paper • 2011.07743 • Published Nov 16, 2020
KoLA: Carefully Benchmarking World Knowledge of Large Language Models Paper • 2306.09296 • Published Jun 15, 2023 • 19
A Systematic Investigation of KB-Text Embedding Alignment at Scale Paper • 2106.01586 • Published Jun 3, 2021
Bringing Back the Context: Camera Trap Species Identification as Link Prediction on Multimodal Knowledge Graphs Paper • 2401.00608 • Published Dec 31, 2023 • 2
Middleware for LLMs: Tools Are Instrumental for Language Agents in Complex Environments Paper • 2402.14672 • Published Feb 22, 2024 • 1
Don't Generate, Discriminate: A Proposal for Grounding Language Models to Real-World Environments Paper • 2212.09736 • Published Dec 19, 2022
HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models Paper • 2405.14831 • Published May 23, 2024 • 4
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents Paper • 2408.06327 • Published Aug 12, 2024 • 17
Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents Paper • 2410.05243 • Published Oct 7, 2024 • 19
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery Paper • 2410.05080 • Published Oct 7, 2024 • 21
Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale Paper • 2409.17115 • Published Sep 25, 2024 • 62