Why don't the available checkpoints start from a lower step count/amount of seen data?
#6 opened by user09180912480
The 7B model has checkpoints from 150-1000 steps with only 1B tokens seen. Why is it not the same for the 13B model?
Yes, I had a similar question. Unless I'm misunderstanding, the first checkpoint available for 13B is "stage1-step102500-tokens860B"? Is that just an issue with accessing the branches?
Hey @seantrott, @user09180912480, I've started uploading the earlier checkpoints. They should be available in a couple of days.
That's great, thank you so much!
amanrangapur changed discussion status to closed
That's awesome, thank you!
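For anyone who wants to check which checkpoints are currently visible, here is a rough sketch that parses branch names of the form quoted above ("stage1-step102500-tokens860B") and picks the earliest one. It assumes every checkpoint branch follows that exact pattern, which this thread does not confirm. The helper names (`parse_checkpoint`, `earliest_checkpoint`, `list_branch_names`) are made up for illustration, and the repo id is left as a parameter because it isn't given here. Listing branches relies on `HfApi.list_repo_refs` from `huggingface_hub`.

```python
import re
from typing import Iterable, List, NamedTuple, Optional


class Checkpoint(NamedTuple):
    stage: int
    step: int
    tokens_b: int  # tokens seen, in billions, as written in the branch name
    name: str


# Assumed naming scheme, based only on the one name quoted in this thread.
_PATTERN = re.compile(r"^stage(\d+)-step(\d+)-tokens(\d+)B$")


def parse_checkpoint(name: str) -> Optional[Checkpoint]:
    """Return a Checkpoint for a matching branch name, else None (e.g. 'main')."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    stage, step, tokens_b = (int(g) for g in m.groups())
    return Checkpoint(stage, step, tokens_b, name)


def earliest_checkpoint(branch_names: Iterable[str]) -> Optional[Checkpoint]:
    """Pick the checkpoint with the lowest (stage, step); None if nothing matches."""
    parsed = [c for c in map(parse_checkpoint, branch_names) if c is not None]
    return min(parsed, key=lambda c: (c.stage, c.step), default=None)


def list_branch_names(repo_id: str) -> List[str]:
    """List branch names of a Hub repo (needs network access and huggingface_hub)."""
    from huggingface_hub import HfApi

    refs = HfApi().list_repo_refs(repo_id)
    return [branch.name for branch in refs.branches]
```

Calling `earliest_checkpoint(list_branch_names(repo_id))` with the 13B repo id should show whether the earlier checkpoints have landed yet. A specific branch can then be loaded by passing its name as `revision=` to `from_pretrained`.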