Self-Improving Agents - a alexngai Collection

alexngai 's Collections

Autonomous Research

Automated Research

Test-Time Compute/Optimal Scaling

Self-Improving Agents

Codegen Benchmarks

Self-Improving Agents

updated Jan 29

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Paper • 2410.22304 • Published Oct 29, 2024 • 18
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization

Paper • 2410.19609 • Published Oct 25, 2024 • 17
Adapting While Learning: Grounding LLMs for Scientific Problems with Intelligent Tool Usage Adaptation

Paper • 2411.00412 • Published Nov 1, 2024 • 10
Improving Autonomous AI Agents with Reflective Tree Search and Self-Learning

Paper • 2410.02052 • Published Oct 2, 2024 • 9
VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment

Paper • 2410.01679 • Published Oct 2, 2024 • 25
DOTS: Learning to Reason Dynamically in LLMs via Optimal Reasoning Trajectories Search

Paper • 2410.03864 • Published Oct 4, 2024 • 12
AFlow: Automating Agentic Workflow Generation

Paper • 2410.10762 • Published Oct 14, 2024
Boundless Socratic Learning with Language Games

Paper • 2411.16905 • Published Nov 25, 2024 • 2
Enabling Scalable Oversight via Self-Evolving Critic

Paper • 2501.05727 • Published Jan 10 • 70
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains

Paper • 2501.05707 • Published Jan 10 • 20
Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Paper • 2501.11425 • Published Jan 20 • 93
Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 26