What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective • arXiv:2410.23743 • Published 18 days ago
Beyond Fine-tuning: Unleashing the Potential of Continuous Pretraining for Clinical LLMs • arXiv:2409.14988 • Published Sep 23
From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data • arXiv:2406.19292 • Published Jun 27