ZClip: Adaptive Spike Mitigation for LLM Pre-Training Paper • 2504.02507 • Published Apr 3 • 78
Article Introducing smolagents: simple agents that write actions in code. By m-ric and 2 others • Dec 31, 2024 • 1.04k
Variance Control via Weight Rescaling in LLM Pre-training Paper • 2503.17500 • Published Mar 21 • 5
Komodo: A Linguistic Expedition into Indonesia's Regional Languages Paper • 2403.09362 • Published Mar 14, 2024 • 6