- Safeguard Fine-Tuned LLMs Through Pre- and Post-Tuning Model Merging (arXiv:2412.19512, published Dec 27, 2024)
- Course-Correction: Safety Alignment Using Synthetic Preferences (arXiv:2407.16637, published Jul 23, 2024)