Accelerated Preference Optimization for Large Language Model Alignment Paper • 2410.06293 • Published Oct 8, 2024 • 5
MARS: Unleashing the Power of Variance Reduction for Training Large Models Paper • 2411.10438 • Published Nov 15, 2024 • 13
DPLM-2: A Multimodal Diffusion Protein Language Model Paper • 2410.13782 • Published Oct 17, 2024 • 20
An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models Paper • 2408.00724 • Published Aug 1, 2024 • 1
General Preference Modeling with Preference Representations for Aligning Language Models Paper • 2410.02197 • Published Oct 3, 2024 • 8
ProteinBench: A Holistic Evaluation of Protein Foundation Models Paper • 2409.06744 • Published Sep 10, 2024 • 8
Post: We've open-sourced the code and models for Self-Play Preference Optimization (SPPO)! 🚀🚀🚀 🤗 paper: Self-Play Preference Optimization for Language Model Alignment (2405.00675) ⭐ code: https://github.com/uclaml/SPPO 🤗 models: UCLA-AGI/sppo-6635fdd844f2b2e4a94d0b9a
Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision Paper • 2403.09472 • Published Mar 14, 2024 • 1
Self-Play Preference Optimization for Language Model Alignment Paper • 2405.00675 • Published May 1, 2024 • 27
DecompOpt: Controllable and Decomposed Diffusion Models for Structure-based Molecular Optimization Paper • 2403.13829 • Published Mar 7, 2024
Horizon-free Reinforcement Learning in Adversarial Linear Mixture MDPs Paper • 2305.08359 • Published May 15, 2023
Risk Bounds of Accelerated SGD for Overparameterized Linear Regression Paper • 2311.14222 • Published Nov 23, 2023
How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression? Paper • 2310.08391 • Published Oct 12, 2023
Variance-Aware Regret Bounds for Stochastic Contextual Dueling Bandits Paper • 2310.00968 • Published Oct 2, 2023