Anton Lozhkov

anton-l

AI & ML interests

Generative Models, Distributed Training, Photo and Video Enhancement

Recent Activity

liked a Space 5 months ago

lvwerra/jagged-data-frontier

upvoted a paper 5 days ago

Scaling Laws for Mixture Pretraining Under Data Constraints

Paper • 2605.12715 • Published 8 days ago • 4

upvoted an article 2 months ago

Introducing Storage Buckets on the Hugging Face Hub

Article • Wauplin, coyotte508, XciD, victor, julien-c, lhoestq, pierric, Sylvestre, hlarcher, rajatarya, seanses, assafvayner • Mar 10 • 194

upvoted an article 11 months ago

SmolLM3: smol, multilingual, long-context reasoner

Article • eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 776

upvoted a collection over 1 year ago

OpenR1-Math

Dataset and SFT model distilled from DeepSeek-R1. Check out our blog post for more details: https://huggingface.co/blog/open-r1/update-2 • 3 items • Updated May 13, 2025 • 9

upvoted an article over 1 year ago

Open R1: Update #2

Article • open-r1 • Feb 10, 2025 • 218

upvoted a paper over 1 year ago

SmolLM2: When Smol Goes Big -- Data-Centric Training of a Small Language Model

Paper • 2502.02737 • Published Feb 4, 2025 • 258

upvoted a collection over 1 year ago

📐 FineMath

FineMath datasets and ablation models • 14 items • Updated May 5, 2025 • 26

upvoted a paper over 1 year ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 133

upvoted 2 articles almost 2 years ago

SmolLM - blazingly fast and remarkably powerful

Article • loubnabnl, anton-l, eliebak • Jul 16, 2024 • 455

Ethics and Society Newsletter #6: Building Better AI: The Importance of Data Quality

Article • evijit, frimelle, yjernite, meg, irenesolaiman, dvilasuero, fdaudens, BrigitteTousi, giadap, sasha • Jun 24, 2024 • 34

upvoted a paper almost 2 years ago

The FineWeb Datasets: Decanting the Web for the Finest Text Data at Scale

Paper • 2406.17557 • Published Jun 25, 2024 • 102

upvoted a collection almost 2 years ago

📀 Dataset comparison models

1.8B models trained on 350BT to compare different pretraining datasets • 8 items • Updated Jun 12, 2024 • 42

upvoted a paper about 2 years ago

StarCoder 2 and The Stack v2: The Next Generation

Paper • 2402.19173 • Published Feb 29, 2024 • 156

upvoted a paper over 2 years ago

Zephyr: Direct Distillation of LM Alignment

Paper • 2310.16944 • Published Oct 25, 2023 • 123

upvoted a paper almost 3 years ago

The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only

Paper • 2306.01116 • Published Jun 1, 2023 • 45