CODESIM: Multi-Agent Code Generation and Problem Solving through Simulation-Driven Planning and Debugging Paper • 2502.05664 • Published Feb 8 • 24
AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation Paper • 2312.13010 • Published Dec 20, 2023 • 6
HyperAgent: Generalist Software Engineering Agents to Solve Coding Tasks at Scale Paper • 2409.16299 • Published Sep 9, 2024 • 12
Vibe Coding vs. Agentic Coding: Fundamentals and Practical Implications of Agentic AI Paper • 2505.19443 • Published May 26 • 15
GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents Paper • 2505.23671 • Published May 29 • 3
SWE-Flow: Synthesizing Software Engineering Data in a Test-Driven Manner Paper • 2506.09003 • Published Jun 10 • 18
Skywork-SWE: Unveiling Data Scaling Laws for Software Engineering in LLMs Paper • 2506.19290 • Published Jun 24 • 52
SWE-Perf: Can Language Models Optimize Code Performance on Real-World Repositories? Paper • 2507.12415 • Published Jul 16 • 42
GitChameleon: Evaluating AI Code Generation Against Python Library Version Incompatibilities Paper • 2507.12367 • Published Jul 16 • 6
SWE-Debate: Competitive Multi-Agent Debate for Software Issue Resolution Paper • 2507.23348 • Published Jul 31 • 11
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 181
Training Long-Context, Multi-Turn Software Engineering Agents with Reinforcement Learning Paper • 2508.03501 • Published Aug 5 • 56
Agentic Software Engineering: Foundational Pillars and a Research Roadmap Paper • 2509.06216 • Published 21 days ago • 7
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks? Paper • 2509.16941 • Published 8 days ago • 19