In config.json we only have n_embed=1024 while BLOOM was trained with a sequence length of 2048.
#9 opened by TingchenFu
According to the BLOOM paper, "BLOOM: A 176B-Parameter Open-Access Multilingual Language Model", all BLOOM models were trained with a sequence length of 2048, so why is n_embed only 1024 in config.json?
n_embed is the hidden (embedding) dimension, not the sequence length; the 2048 from the paper is the training sequence length. That same parameter is 14336 for the full-sized model: https://huggingface.co/bigscience/bloom/blob/main/config.json#L16
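To make the distinction concrete, here is a minimal sketch of the two quantities. The numbers are assumptions taken from this thread (n_embed=1024 from the config, sequence length 2048 from the paper); the vocabulary size is illustrative.

```python
# n_embed is the width of each token's embedding vector (the hidden dimension);
# the sequence length is how many token positions the model processes at once.
n_embed = 1024        # hidden / embedding dimension, from config.json
seq_len = 2048        # training sequence length, from the BLOOM paper
vocab_size = 250880   # illustrative vocabulary size

# Embedding table: one n_embed-dimensional vector per vocabulary token.
# Its shape does not involve the sequence length at all.
embedding_table_shape = (vocab_size, n_embed)

# Activations for one sequence: seq_len positions, each n_embed wide.
hidden_states_shape = (seq_len, n_embed)

print(embedding_table_shape)  # (250880, 1024)
print(hidden_states_shape)    # (2048, 1024)
```

The two parameters are independent: a model can be made wider (larger n_embed) or given a longer context (larger sequence length) without changing the other.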
christopher changed discussion status to closed