Ahmad Khan committed on
Commit 03e7b52 · 1 Parent(s): 96abf36

Fix variable name for extra memory note

Files changed (1)
  1. src/index.html +1 -1
src/index.html CHANGED
@@ -452,7 +452,7 @@
  <div class="note-box">
  <p class="note-box-title">📝 Note</p>
  <div class="note-box-content">
- <p>Some libraries store grads in FP32, which would require an additional <d-math>m_{params\_fp32} = 4 * N</d-math> memory. This is done, for example, in Nanotron, because BF16 is lossy for smaller values and we always prioritize stability. See <a href="https://github.com/microsoft/DeepSpeed/issues/1773">this DeepSpeed issue</a> for more information.</p>
+ <p>Some libraries store grads in FP32, which would require an additional <d-math>m_{grad\_fp32} = 4 * N</d-math> memory. This is done, for example, in Nanotron, because BF16 is lossy for smaller values and we always prioritize stability. See <a href="https://github.com/microsoft/DeepSpeed/issues/1773">this DeepSpeed issue</a> for more information.</p>
  </div>
  </div>
 
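
For reviewers checking the renamed term: the formula in the changed line, m_{grad\_fp32} = 4 * N, can be read as 4 bytes per parameter for an FP32 copy of the gradients. The snippet below is a minimal sketch of that arithmetic, assuming N is the parameter count and the result is in bytes. The function name `fp32_grad_bytes` and the 8B-parameter model size are hypothetical and do not come from the commit.

```python
BYTES_PER_FP32 = 4  # one FP32 value occupies 4 bytes


def fp32_grad_bytes(num_params: int) -> int:
    """Extra bytes needed to keep an FP32 copy of the gradients (4 * N)."""
    return BYTES_PER_FP32 * num_params


# Hypothetical example: a model with 8 billion parameters.
n = 8_000_000_000
extra_gib = fp32_grad_bytes(n) / 2**30
print(f"Extra FP32 gradient memory: {extra_gib:.1f} GiB")  # prints 29.8 GiB
```

This covers only the additional gradient term named in the note. Parameters, optimizer states, and activations are accounted for separately.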