Tool Use Reasoning Collection A collection of tool-use reasoning datasets in Hermes format • 5 items • Updated Jul 23 • 8
OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling Paper • 2506.20512 • Published Jun 25 • 46
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning Paper • 2505.24726 • Published May 30 • 271
AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning Paper • 2505.24298 • Published May 30 • 28
Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models Paper • 2501.13629 • Published Jan 23 • 49
xLAM models Collection xLAM: A Family of Large Action Models to Empower AI Agent Systems: https://github.com/SalesforceAIResearch/xLAM • 21 items • Updated Jul 28 • 53
Qwen2.5-Coder Collection Code-specific model series based on Qwen2.5 • 40 items • Updated Jul 21 • 335
SmolLM2 Collection State-of-the-art compact LLMs for on-device applications: 1.7B, 360M, 135M • 16 items • Updated May 5 • 289
Useful Pretrain-Datasets Collection Pretraining datasets of (possibly) good quality • 21 items • Updated Mar 12 • 1
GPT-4 generated datasets Collection Collection of some GPT-4 generated datasets. It may be useful for those looking for the best-quality datasets to train competitive LLMs. • 18 items • Updated Apr 16, 2024 • 9
Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length Paper • 2404.08801 • Published Apr 12, 2024 • 68
Scalable Pre-training of Large Autoregressive Image Models Paper • 2401.08541 • Published Jan 16, 2024 • 39
E^2-LLM: Efficient and Extreme Length Extension of Large Language Models Paper • 2401.06951 • Published Jan 13, 2024 • 27
Papers about model merging Collection referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated Feb 13, 2024 • 14