Understanding the performance gap between online and offline alignment algorithms • Paper 2405.08448 • Published May 14
OpenRLHF: An Easy-to-use, Scalable and High-performance RLHF Framework • Paper 2405.11143 • Published May 20