Featured • The Smol Training Playbook 📚 • 3.04k • The secrets to building world-class LLMs
MixCPT • Collection • Rethinking Multilingual Continual Pretraining: Data Mixing for Adapting LLMs Across Languages and Resources • 41 items • Updated Oct 21, 2025 • 1
Test-Time Scaling of Reasoning Models for Machine Translation Paper • 2510.06471 • Published Oct 7, 2025 • 1