Running 2.78k The Ultra-Scale Playbook: The ultimate guide to training LLM on large GPU Clusters
Running 573 Scaling test-time compute: Enhance math problem solving by scaling test-time compute
HuggingFaceTB/SmolLM-135M-Instruct Text Generation • 0.1B • Updated Sep 4, 2024 • 13.6k • 119