Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report Paper • 2508.01059 • Published Aug 1 • 33
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens Paper • 2508.01191 • Published Aug 2 • 234
On the Generalization of SFT: A Reinforcement Learning Perspective with Reward Rectification Paper • 2508.05629 • Published Aug 7 • 170
GLM-4.5: Agentic, Reasoning, and Coding (ARC) Foundation Models Paper • 2508.06471 • Published Aug 8 • 175
UI-Venus Technical Report: Building High-performance UI Agents with RFT Paper • 2508.10833 • Published 25 days ago • 41
The Landscape of Agentic Reinforcement Learning for LLMs: A Survey Paper • 2509.02547 • Published 6 days ago • 164