Universal Checkpointing: Efficient and Flexible Checkpointing for Large Scale Distributed Training Paper • 2406.18820 • Published Jun 27, 2024
The Case for Co-Designing Model Architectures with Hardware Paper • 2401.14489 • Published Jan 25, 2024
Datasets: A Community Library for Natural Language Processing Paper • 2109.02846 • Published Sep 7, 2021
OBELICS: An Open Web-Scale Filtered Dataset of Interleaved Image-Text Documents Paper • 2306.16527 • Published Jun 21, 2023
What Language Model to Train if You Have One Million GPU Hours? Paper • 2210.15424 • Published Oct 27, 2022
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model Paper • 2211.05100 • Published Nov 9, 2022