Checkpts 9b vs 9b-it storage

#10
by Hemanth-thunder - opened

Hello, one difference between these two checkpoints (9b and 9b-it) is that the new one seems to have extra shards uploaded.

gemma-2-9b-it --> 4 checkpoint shards
gemma-2-9b --> 8 checkpoint shards

gemma-2-9b is stored in float32 and the other one in float16, hence the extra shards, I believe.
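To sanity-check that explanation, here is a back-of-the-envelope sketch. The parameter count (about 9 billion) and the shard size limit (about 5 GB) are assumptions chosen for illustration, not values read from the repositories, and real shard counts depend on how tensors are packed into files.

```python
# Rough estimate only: actual shard counts depend on how tensors are packed.
PARAMS = 9_000_000_000           # assumption: roughly 9B parameters
MAX_SHARD_BYTES = 5_000_000_000  # assumption: about 5 GB per shard
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def estimate_shards(dtype: str) -> int:
    """Estimate how many shards a checkpoint needs for the given dtype."""
    total_bytes = PARAMS * BYTES_PER_PARAM[dtype]
    return -(-total_bytes // MAX_SHARD_BYTES)  # ceiling division


for dtype in BYTES_PER_PARAM:
    print(dtype, estimate_shards(dtype))
# float32 8
# float16 4
```

Under those assumptions, halving the bytes per parameter halves the shard count, which lines up with the 8 vs 4 shards seen here.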

Hello @rashmi, I was wondering the same thing. Do models need separate checkpoints for float32 and float16? It seems not: the dtype can be converted on the fly when the weights are loaded.
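As a small illustration of converting precision on the fly, the sketch below uses only Python's standard `struct` module on a few made-up values. It is not how any particular library loads weights. In `transformers`, the usual route is the `torch_dtype` argument of `from_pretrained`. The sample values are hypothetical and were picked because float16 can represent them exactly.

```python
import struct

# Hypothetical "weights"; chosen so that float16 represents them exactly.
weights = [0.5, -1.25, 3.0, 1024.0]

# Stored as float32: 4 bytes per value.
fp32_bytes = struct.pack("<4f", *weights)

# Convert on the fly: read the float32 values, re-encode as float16 ("e").
loaded = struct.unpack("<4f", fp32_bytes)
fp16_bytes = struct.pack("<4e", *loaded)

# Read the half-precision values back to compare with the originals.
restored = list(struct.unpack("<4e", fp16_bytes))

print(len(fp32_bytes), len(fp16_bytes))
print(restored == weights)
```

The same stored values can be re-encoded at a different precision when read, so one checkpoint is enough. Values that float16 cannot represent exactly would be rounded during the conversion.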
