Simplifying Alignment: From RLHF to Direct Preference Optimization (DPO) • By ariG23498 • Jan 19
Finetune Stable Diffusion Models with DDPO via TRL • By metric-space and 3 others • Sep 29, 2023