MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning Paper • 2505.24846 • Published May 30 • 15
SafeScientist: Toward Risk-Aware Scientific Discoveries by LLM Agents Paper • 2505.23559 • Published May 29 • 12
ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind Paper • 2505.22961 • Published May 29 • 8
Time-R1 Collection Time-R1: Framework and resources for endowing LLMs with comprehensive temporal reasoning (understanding, prediction, creative generation). • 7 items • Updated Jun 1 • 1
Maintaining Adversarial Robustness in Continuous Learning Paper • 2402.11196 • Published Feb 17, 2024 • 1
Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL Paper • 2505.02391 • Published May 5 • 25