kiran

kira

5 32 199

ki6an

AI & ML interests

agi

Organizations

upvoted a paper 5 months ago

On Data Engineering for Scaling LLM Terminal Capabilities

Paper • 2602.21193 • Published Feb 24 • 103

upvoted a collection 5 months ago

Nemotron-Post-Training-v3

Collection

Collection of datasets used in the post-training phase of Nemotron Nano, Super, and Ultra v3. • 50 items • Updated 30 days ago • 175

upvoted an article 7 months ago

Article

Deriving the PPO Loss from First Principles

garg-aayush • Dec 25, 2025 • 46

upvoted a collection 7 months ago

GLM-4.7

Collection

3 items • Updated Jan 19 • 68

upvoted an article 7 months ago

Article

SmolLM3: smol, multilingual, long-context reasoner

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 782

upvoted a collection 8 months ago

Olmo 3 Post-training

Collection

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated Dec 23, 2025 • 58

upvoted a collection 12 months ago

Tool Use Reasoning

Collection

A collection of tool use reasoning datasets in Hermes format • 5 items • Updated Jul 23, 2025 • 10

upvoted 3 papers about 1 year ago

OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling

Paper • 2506.20512 • Published Jun 25, 2025 • 49

Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning

Paper • 2505.24726 • Published May 30, 2025 • 283

AReaL: A Large-Scale Asynchronous Reinforcement Learning System for Language Reasoning

Paper • 2505.24298 • Published May 30, 2025 • 34

upvoted a collection over 1 year ago

Llama 4

Collection

Llama 4 release • 13 items • Updated Apr 29, 2025 • 740

upvoted a paper over 1 year ago

Sigma: Differential Rescaling of Query, Key and Value for Efficient Language Models

Paper • 2501.13629 • Published Jan 23, 2025 • 48

upvoted 3 collections over 1 year ago

upvoted 2 collections almost 2 years ago

Mini Pretrain Datasets

Collection

9 items • Updated Jul 9, 2024 • 12

Useful Pretrain-Datasets

Collection

pretrain-datasets with (maybe) good quality • 21 items • Updated Mar 12, 2025 • 2

upvoted 2 collections about 2 years ago

Yi-1.5 (2024/05)

Collection

10 items • Updated May 20, 2024 • 93

GPT-4 generated datasets

Collection

Collection of some GPT-4 generated datasets. It may be useful for those looking for the best-quality datasets to train competitive LLMs. • 18 items • Updated Apr 16, 2024 • 10

upvoted a paper about 2 years ago

Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length

Paper • 2404.08801 • Published Apr 12, 2024 • 66
