In config.json, n_embed is only 1024, while BLOOM was trained with a sequence length of 2048.

#9
by TingchenFu - opened

According to the BLOOM paper (BLOOM: A 176B-Parameter Open-Access Multilingual Language Model), all BLOOM models were trained with a sequence length of 2048, so why is n_embed only 1024 in config.json?

BigScience Workshop org

n_embed is the hidden dimension (embedding size), not the sequence length. That same parameter is 14336 for the full-sized model: https://huggingface.co/bigscience/bloom/blob/main/config.json#L16
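For anyone checking this programmatically, here is a minimal sketch using transformers. The checkpoint name is an assumption on my part (the thread does not say which variant has n_embed=1024; bloom-560m does):

```python
from transformers import AutoConfig

# Assumed checkpoint: bigscience/bloom-560m, whose config has n_embed=1024.
config = AutoConfig.from_pretrained("bigscience/bloom-560m")

# n_embed maps to hidden_size: the width of each token's hidden state.
print(config.hidden_size)  # 1024 here; 14336 for bigscience/bloom

# The 2048-token training sequence length is a separate training setting,
# not n_embed; BLOOM uses ALiBi, so the config also has no fixed
# positional-embedding size.
```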

christopher changed discussion status to closed