Running • 3.26k • The Ultra-Scale Playbook • The ultimate guide to training LLMs on large GPU clusters
Quartet: Native FP4 Training Can Be Optimal for Large Language Models • Paper • 2505.14669 • Published May 20 • 77
T-pro-2.0 • Collection • Hybrid reasoning model based on Qwen3 32B • 12 items • Updated Jul 18 • 30