Large Language Models Cannot Self-Correct Reasoning Yet Paper • 2310.01798 • Published Oct 3, 2023
Premise Order Matters in Reasoning with Large Language Models Paper • 2402.08939 • Published Feb 14, 2024
Chain of Thought Empowers Transformers to Solve Inherently Serial Problems Paper • 2402.12875 • Published Feb 20, 2024
ReAct: Synergizing Reasoning and Acting in Language Models Paper • 2210.03629 • Published Oct 6, 2022
WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents Paper • 2207.01206 • Published Jul 4, 2022
Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs Paper • 2406.11695 • Published Jun 17, 2024
Fine-Tuning and Prompt Optimization: Two Great Steps that Work Better Together Paper • 2407.10930 • Published Jul 15, 2024
SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering Paper • 2405.15793 • Published May 6, 2024
OpenDevin: An Open Platform for AI Software Developers as Generalist Agents Paper • 2407.16741 • Published Jul 23, 2024
WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? Paper • 2403.07718 • Published Mar 12, 2024
WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks Paper • 2407.05291 • Published Jul 7, 2024
Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping Paper • 2402.14083 • Published Feb 21, 2024
Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces Paper • 2410.09918 • Published Oct 13, 2024