Power Scheduler: A Batch Size and Token Number Agnostic Learning Rate Scheduler • Paper • arXiv:2408.13359 • Published Aug 23, 2024
The infrastructure powering IBM's Gen AI model development • Paper • arXiv:2407.05467 • Published Jul 7, 2024
Granite-Function Calling Model: Introducing Function Calling Abilities via Multi-task Learning of Granular Tasks • Paper • arXiv:2407.00121 • Published Jun 27, 2024
Diversity Measurement and Subset Selection for Instruction Tuning Datasets • Paper • arXiv:2402.02318 • Published Feb 4, 2024
Granite Code Models: A Family of Open Foundation Models for Code Intelligence • Paper • arXiv:2405.04324 • Published May 7, 2024