CodeRAG-Bench: Can Retrieval Augment Code Generation? Paper • 2406.14497 • Published Jun 20, 2024
TheAgentCompany: Benchmarking LLM Agents on Consequential Real World Tasks Paper • 2412.14161 • Published about 1 month ago
Confidence-Building Measures for Artificial Intelligence: Workshop Proceedings Paper • 2308.00862 • Published Aug 1, 2023