The Ultra-Scale Playbook 🌌 The ultimate guide to training LLM on large GPU Clusters • Running • 1.24k
Step-Video-T2V Technical Report: The Practice, Challenges, and Future of Video Foundation Model Paper • 2502.10248 • Published 8 days ago • 49
Congliu/Chinese-DeepSeek-R1-Distill-data-110k Viewer • Updated 1 day ago • 110k • 1.61k • 300
CodeI/O Collection Collection for CodeI/O @ https://codei-o.github.io/ • 15 items • Updated 9 days ago • 6
CodeI/O: Condensing Reasoning Patterns via Code Input-Output Prediction Paper • 2502.07316 • Published 11 days ago • 43
QuEST: Stable Training of LLMs with 1-Bit Weights and Activations Paper • 2502.05003 • Published 15 days ago • 41
Reasoning Datasets Collection Distilled synthetic Reasoning datasets • 7 items • Updated 20 days ago • 54