Stephen Oates PRO

soates

AI & ML interests

None yet

Recent Activity

upvoted an article 23 days ago

Deriving the PPO Loss from First Principles

upvoted an article about 1 month ago

How We Use Claude Code Skills to Run 1,000+ ML Experiments a Day

upvoted a collection about 1 month ago

Physics of Language Models: Part 4.2

Organizations

None yet

liked a model 6 months ago

Menlo/Lucy-128k

Text Generation • 2B • Updated Aug 4, 2025 • 196 • 108

liked a model 7 months ago

chandar-lab/NeoBERT

Feature Extraction • 0.2B • Updated Mar 25, 2025 • 2.66k • 187

liked a Space 11 months ago

The Ultra-Scale Playbook

🌌

3.65k

The ultimate guide to training LLM on large GPU Clusters

liked a model about 1 year ago

Datou1111/shou_xin

Text-to-Image • Updated Mar 16, 2025 • 182 • 878

liked a model over 1 year ago

lamm-mit/LifeGPT

Updated Sep 19, 2024 • 9

liked a Space over 1 year ago

Open-LLM performances are plateauing, let’s make the leaderboard steep again

🏔

125

Explore and compare advanced language models on a new leaderboard

liked a model over 1 year ago

nisten/Biggie-SmoLlm-0.15B-Base

Text Generation • 0.2B • Updated Aug 7, 2024 • 1.08k • 241

liked 3 Spaces over 1 year ago

Gpt2 Multiplication Predictor

📈

Multiply large numbers using different reasoning methods

FineWeb: decanting the web for the finest text data at scale

🍷

1.26k

Generate high-quality text data for LLMs using FineWeb

Phi-3 WebGPU

🚀

291

A private and powerful AI that runs locally in your browser

liked a model over 1 year ago

rombodawg/test_dataset_Codellama-3-8B

Text Generation • 8B • Updated May 4, 2024 • 4 • 78
