
Why don't the available checkpoints start from a lower amount of steps/seen data?

#6
by user09180912480 - opened

The 7B model has checkpoints from steps 150-1000, with only 1B tokens seen. Why isn't the same available for the 13B model?

Yes, I had a similar question. Unless I'm misunderstanding, the first available checkpoint for 13B is "stage1-step102500-tokens860B"? Is that just an issue with accessing the branches?
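For anyone wanting to check which checkpoint branches are visible, here is a rough sketch. It assumes branch names follow the `stage<N>-step<S>-tokens<T>B` pattern seen in the name quoted above, which is inferred from that single example and may not hold for every branch. `parse_checkpoint_branch`, `earliest_checkpoint`, and `list_branch_names` are hypothetical helper names; `list_repo_refs` is the `huggingface_hub` call for listing a repo's branches, and the repo ID is left for you to fill in.

```python
import re

# Assumed naming pattern, inferred from "stage1-step102500-tokens860B".
_BRANCH_RE = re.compile(r"^stage(?P<stage>\d+)-step(?P<step>\d+)-tokens(?P<tokens>\d+)B$")


def parse_checkpoint_branch(name):
    """Return (stage, step, tokens_in_billions), or None if the name does not match."""
    match = _BRANCH_RE.match(name)
    if match is None:
        return None
    return int(match.group("stage")), int(match.group("step")), int(match.group("tokens"))


def earliest_checkpoint(branch_names):
    """Return the matching branch name with the lowest (stage, step), or None."""
    parsed = [(parse_checkpoint_branch(name), name) for name in branch_names]
    parsed = [(info, name) for info, name in parsed if info is not None]
    if not parsed:
        return None
    # Order by stage first, then by step within the stage.
    return min(parsed, key=lambda pair: (pair[0][0], pair[0][1]))[1]


def list_branch_names(repo_id):
    """Fetch branch names from the Hub (needs network access and huggingface_hub)."""
    from huggingface_hub import list_repo_refs  # imported lazily; optional dependency

    return [branch.name for branch in list_repo_refs(repo_id).branches]
```

Calling `earliest_checkpoint(list_branch_names(repo_id))` with the model's repo ID should show whether earlier branches exist on the Hub or just aren't showing up in the web UI.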

Hey @seantrott, @user09180912480, I've started uploading earlier checkpoints. They should be up in a couple of days.

That's great, thank you so much!

amanrangapur changed discussion status to closed

Hey @seantrott, I've uploaded a lot of early-stage checkpoints.

That's awesome, thank you!
