Update description1.md
description1.md CHANGED (+4 -2)
@@ -6,9 +6,11 @@ From our findings, we need approximately 1/3 memory under ideal conditions (F, B
 
 Check out our paper at [Arxiv](https://arxiv.org/abs/2405.15362).
 
+
 | Comparison assuming T_F=T_B=T_W | 1F1B | V-Min | V-Half | V-ZB |
 | ----------------------------------------------------- | ------- |------- | ---------- | ---- |
-| Bubble Rate | (p-1)/m |
-| Activation Memory <br> (Compared to 1F1B) | p | (p+4)
+| Bubble Rate | (p-1)/m | ~ 2p/3m | ~ p/ 2m | 0 |
+| Activation Memory <br> (Compared to 1F1B) | p | (p+4)/3 | (p+2)/2 | p |
+
 
 Bubble Rate here is calculated as (1 - longest stage time/(F+B+W)/m).
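
Review note: the table entries in the added lines can be sanity-checked numerically. The sketch below is not part of the commit. It evaluates the four columns for a given number of stages `p` and microbatches `m`, reading `~ 2p/3m` as 2p/(3m) and `~ p/ 2m` as p/(2m), which is an assumption about the intended grouping. It also computes a bubble rate from a measured stage time as `longest stage time / ((F+B+W)*m) - 1`. That sign convention is an assumption as well: it is the reading that reproduces the table's `(p-1)/m` entry for 1F1B, whereas the sentence in the file is written as `1 - ...`. The function names are hypothetical.

```python
from fractions import Fraction


def table_values(p, m):
    """Evaluate the comparison table (T_F = T_B = T_W) for p stages, m microbatches.

    Assumption: "2p/3m" means 2p/(3m) and "p/ 2m" means p/(2m).
    The V-Min and V-Half bubble rates are marked approximate (~) in the table.
    """
    return {
        "1F1B":   {"bubble": Fraction(p - 1, m),     "memory": Fraction(p)},
        "V-Min":  {"bubble": Fraction(2 * p, 3 * m), "memory": Fraction(p + 4, 3)},
        "V-Half": {"bubble": Fraction(p, 2 * m),     "memory": Fraction(p + 2, 2)},
        "V-ZB":   {"bubble": Fraction(0),            "memory": Fraction(p)},
    }


def bubble_rate(longest_stage_time, t_f, t_b, t_w, m):
    """Bubble rate from a measured stage time.

    Assumption: rate = longest_stage_time / ((F + B + W) * m) - 1,
    the convention that matches the table's 1F1B entry.
    """
    ideal = (t_f + t_b + t_w) * m
    return Fraction(longest_stage_time) / ideal - 1


if __name__ == "__main__":
    for name, row in table_values(p=8, m=32).items():
        print(f"{name:7s} bubble={float(row['bubble']):.4f} memory={float(row['memory']):.2f}")
```

For 1F1B with unit costs, the longest stage runs for `3 * (m + p - 1)` time units, so `bubble_rate(3 * (m + p - 1), 1, 1, 1, m)` agrees with the `(p-1)/m` column.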