Article SmolLM3: smol, multilingual, long-context reasoner By loubnabnl and 22 others • 9 days ago • 541
MiniMax-M1 Collection MiniMax-M1, the world's first open-weight, large-scale hybrid-attention reasoning model. • 6 items • Updated 14 days ago • 108
General-Reasoner Collection Advancing LLMs' general reasoning capabilities • 9 items • Updated 22 days ago • 4
Article Selective fine-tuning of Language Models with Spectrum By anakin87 • Sep 3, 2024 • 36
Running 2.82k The Ultra-Scale Playbook • The ultimate guide to training LLM on large GPU Clusters
Article The N Implementation Details of RLHF with PPO By vwxyzjn and 2 others • Oct 24, 2023 • 61
Running 574 Scaling test-time compute • Enhance math problem solving by scaling test-time compute