Paper • Critique Fine-Tuning: Learning to Critique is More Effective than Learning to Imitate • arXiv 2501.17703
Paper • Parameters vs FLOPs: Scaling Laws for Optimal Sparsity for Mixture-of-Experts Language Models • arXiv 2501.12370
Article • Preference Tuning LLMs with Direct Preference Optimization Methods • Jan 18, 2024
Collection • Preference Datasets for DPO: curated preference datasets for DPO fine-tuning for intent alignment of LLMs • 7 items • Updated Dec 11, 2024
Article • Train 400x faster Static Embedding Models with Sentence Transformers
Article • A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes • Aug 17, 2022