Benchmarking Optimizers for Large Language Model Pretraining • Paper 2509.01440 • Published Sep 1