Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't Paper • 2503.16219 • Published 4 days ago • 38
SOAP: Improving and Stabilizing Shampoo using Adam Paper • 2409.11321 • Published Sep 17, 2024 • 1
Small Models Struggle to Learn from Strong Reasoners Paper • 2502.12143 • Published Feb 17 • 32
Granite Data Collection This collection has a set of artifacts which are related to curating and evaluating datasets used for Granite models • 16 items • Updated 24 days ago • 4
Article Introducing Three New Serverless Inference Providers: Hyperbolic, Nebius AI Studio, and Novita 🔥 Feb 18 • 95
SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model Paper • 2502.02737 • Published Feb 4 • 208
The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training Paper • 2501.18965 • Published Jan 31 • 7