Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone • arXiv:2404.14219 • Published Apr 22, 2024
LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models • arXiv:2310.08659 • Published Oct 12, 2023
Less is More: Task-aware Layer-wise Distillation for Language Model Compression • arXiv:2210.01351 • Published Oct 4, 2022