GPU requirement
#20 by meetzuber
How much VRAM is needed to run the model in fp16/bf16?
402B params means 402 billion numbers. At 16 bits, each number takes two bytes, so 402 × 2 = 804. You need about 804 GB of VRAM for the weights alone. That's around 6x H200 (141 GB each).
Again, that's if you want to load the entire model in VRAM (which gives the fastest inference). Since only 17B parameters are active per token (it's a mixture-of-experts model), you can get away with roughly 17 × 2 = 34 GB of VRAM if you offload the inactive experts to CPU RAM or disk, at the cost of speed.
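If it helps, here's a quick sketch of that arithmetic in Python. Note this counts model weights only; the KV cache, activations, and framework overhead add more on top, so treat these as lower bounds:

```python
def weights_vram_gb(params_billions: float, bits_per_param: int = 16) -> float:
    """Rough VRAM needed just to hold the weights, in GB."""
    bytes_per_param = bits_per_param / 8
    # billions of params * bytes per param ~= GB (decimal)
    return params_billions * bytes_per_param

# Full model resident in VRAM at fp16/bf16: all 402B weights
print(weights_vram_gb(402))  # 804.0 GB -> ~6x H200 (141 GB each)

# Only the ~17B active params resident, experts offloaded to CPU/disk
print(weights_vram_gb(17))   # 34.0 GB
```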