Qwen/Qwen3-235B-A22B-Instruct-2507 Text Generation • 235B • Updated 6 days ago • 85k • 649
google/timesfm-2.0-500m-pytorch Time Series Forecasting • 0.5B • Updated Apr 16 • 9.22k • 196
twinkle-ai/Llama-3.2-3B-F1-Reasoning-Instruct Text Generation • 4B • Updated Apr 24 • 46 • 44
Running • 3.1k • The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters