MoDoMoDo: Multi-Domain Data Mixtures for Multimodal LLM Reinforcement Learning Paper • 2505.24871 • Published May 30 • 21
HardTests: Synthesizing High-Quality Test Cases for LLM Coding Paper • 2505.24098 • Published May 30 • 44
xLAM-2 Collection A family of Large Action Model for multi-turn conversation and tool-use • 10 items • Updated May 5 • 18
Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding Paper • 2411.04282 • Published Nov 6, 2024 • 36
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents Paper • 2408.07060 • Published Aug 13, 2024 • 43