Towards a Unified View of Parameter-Efficient Transfer Learning Paper • 2110.04366 • Published Oct 8, 2021
Better May Not Be Fairer: A Study on Subgroup Discrepancy in Image Classification Paper • 2212.08649 • Published Dec 16, 2022
LightSeq: Sequence Level Parallelism for Distributed Training of Long Context Transformers Paper • 2310.03294 • Published Oct 5, 2023
Evaluating Large Language Models on Controlled Generation Tasks Paper • 2310.14542 • Published Oct 23, 2023
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12, 2024