Nathan Lambert

natolambert

https://www.natolambert.com/

AI & ML interests

Reinforcement learning, Ethics, Robotics, Dynamics Models

Recent Activity

liked a model 5 days ago

inclusionAI/Ling-2.6-flash

liked a model 15 days ago

openai/privacy-filter

authored a paper 22 days ago

The ATOM Report: Measuring the Open Language Model Ecosystem

upvoted a collection about 1 month ago

Gemma 4

12 items • Updated 1 day ago • 754

upvoted 2 collections about 2 months ago

NVIDIA Nemotron v3

Open, Production-ready Enterprise Models • 18 items • Updated 9 days ago • 284

Nemotron-Post-Training-v3

Collection of datasets used in the post-training phase of Nemotron Nano and Super v3. • 28 items • Updated 16 days ago • 135

upvoted an article about 2 months ago

How NVIDIA Builds Open Data for AI

Article • Mar 10 • 15

upvoted a collection 2 months ago

Olmo Hybrid

6 items • Updated Mar 5 • 26

upvoted a paper 3 months ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

Paper • 2510.24702 • Published Oct 28, 2025 • 31

upvoted a collection 5 months ago

Olmo 3.1

The latest members of the Olmo 3 family: another 3 weeks of RL for 32B Think, the 32B Instruct model, large post-training research datasets... • 9 items • Updated Dec 23, 2025 • 51

upvoted a paper 5 months ago

olmOCR 2: Unit Test Rewards for Document OCR

Paper • 2510.19817 • Published Oct 22, 2025 • 16

upvoted a collection 6 months ago

Olmo 3 Post-training

All artifacts for post-training Olmo 3. Datasets follow the model that resulted from training on them. • 32 items • Updated Dec 23, 2025 • 54

upvoted a collection 7 months ago

Olmo 3

Artifacts for the Olmo 3 release. • 7 items • Updated Mar 2 • 169

upvoted an article 9 months ago

Introducing Trackio: A Lightweight Experiment Tracking Library from Hugging Face

Article • Jul 29, 2025 • 222

upvoted a paper 10 months ago

Skywork-Reward-V2: Scaling Preference Data Curation via Human-AI Synergy

Paper • 2507.01352 • Published Jul 2, 2025 • 60

upvoted a collection 10 months ago

Reward Models 06-2025

Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 16 days ago • 24

upvoted 2 collections 11 months ago

Reward Bench 2

Datasets, spaces, and models for Reward Bench 2 benchmark and paper! • 11 items • Updated Dec 23, 2025 • 16

Common Pile v0.1

All resources related to Common Pile v0.1, an 8TB dataset of public domain and openly licensed text • 4 items • Updated Jun 6, 2025 • 40

upvoted a collection 12 months ago

OpenVision

27 items • Updated Aug 15, 2025 • 33

upvoted a collection about 1 year ago

Qwen3

84 items • Updated Dec 31, 2025 • 1.77k

upvoted a paper about 1 year ago

Reinforcement Learning from Human Feedback

Paper • 2504.12501 • Published Apr 16, 2025 • 4

upvoted a collection about 1 year ago

OLMoE (January 2025)

Improved OLMoE for iOS app. Read more: https://allenai.org/blog/olmoe-app • 10 items • Updated Dec 23, 2025 • 16

upvoted an article over 1 year ago

Putting RL back in RLHF

Article • Jun 12, 2024 • 111