Ahmad Khan
committed on
Commit
·
03e7b52
1 Parent(s): 96abf36
Fix variable name for extra memory note
src/index.html +1 -1
src/index.html
CHANGED
@@ -452,7 +452,7 @@
 <div class="note-box">
 <p class="note-box-title">📝 Note</p>
 <div class="note-box-content">
-<p>Some libraries store grads in FP32, which would require an additional <d-math>m_{
+<p>Some libraries store grads in FP32, which would require an additional <d-math>m_{grad\_fp32} = 4 * N</d-math> memory. This is done, for example, in Nanotron, because BF16 is lossy for smaller values and we always prioritize stability. See <a href="https://github.com/microsoft/DeepSpeed/issues/1773">this DeepSpeed issue</a> for more information.</p>
 </div>
 </div>
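
The formula in the changed note can be checked with a quick calculation. This is a minimal sketch, assuming the result is in bytes (FP32 uses 4 bytes per value, one value per parameter) and that N is the parameter count. The function names and the 7B-parameter example are hypothetical and not part of this commit.

```python
def fp32_grad_memory_bytes(num_params: int) -> int:
    """Extra memory for an FP32 copy of the gradients: 4 bytes per parameter.

    Mirrors m_{grad_fp32} = 4 * N from the note, with N = num_params.
    """
    if num_params < 0:
        raise ValueError("num_params must be non-negative")
    return 4 * num_params


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return num_bytes / 2**30


if __name__ == "__main__":
    n = 7_000_000_000  # hypothetical model size, chosen only for illustration
    extra = fp32_grad_memory_bytes(n)
    print(f"N = {n:,} params -> {extra:,} bytes ({to_gib(extra):.2f} GiB) of FP32 grads")
```

This counts only the additional FP32 gradient buffer described in the note, not weights, optimizer states, or activations.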