Why is the 1B version of the model 29 GB?
#14, opened by samedii
Why is the model 1B version 29 GB when the 7B version is 32 GB?
MolmoE-1B is a multimodal Mixture-of-Experts LLM with 1.5B active and 7.2B total parameters based on OLMoE-1B-7B-0924.
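The checkpoint stores all ~7B parameters, not only the active ones, so it is about as large as a 7B model. The "1B" in the name refers to how many parameters are used per forward pass, which is a proxy for its speed.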
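For a rough sanity check of the numbers, here is a minimal sketch of the arithmetic. It assumes the weights are stored as float32 (4 bytes per parameter), which is my assumption and not something stated in this thread, and it ignores any file-format overhead. The helper name `checkpoint_size_gb` is made up for illustration.

```python
def checkpoint_size_gb(num_params: float, bytes_per_param: int = 4) -> float:
    """Approximate on-disk size in decimal GB for a dense dump of the weights.

    bytes_per_param=4 assumes float32 storage (an assumption, not confirmed here).
    """
    return num_params * bytes_per_param / 1e9


# Parameter counts quoted from the model description above.
TOTAL_PARAMS = 7.2e9   # all experts are stored in the checkpoint
ACTIVE_PARAMS = 1.5e9  # parameters actually used per forward pass

total_gb = checkpoint_size_gb(TOTAL_PARAMS)
active_gb = checkpoint_size_gb(ACTIVE_PARAMS)

print(f"all parameters:    ~{total_gb:.1f} GB")
print(f"active parameters: ~{active_gb:.1f} GB")
```

Under that assumption the full parameter count comes out close to the 29 GB in the question, while the active parameters alone would be far smaller. Disk size follows the total count; speed follows the active count.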
Thank you!
samedii changed discussion status to closed