Update README.md
README.md CHANGED

@@ -91,12 +91,12 @@ This model used weights pretrained by [lxj616](https://huggingface.co/lxj616/mak
 * **Image size:** 512 x 512
 * **Frame count:** 24
 * **Schedule:**
-  * 2 x 10 epochs: LR warmup for
-  * 2 x 20 epochs: LR warmup for
+  * 2 x 10 epochs: LR warmup for 1 epoch then held constant at 5e-5 (10,000 samples per epoch)
+  * 2 x 20 epochs: LR warmup for 1 epoch then held constant at 5e-5 (10,000 samples per epoch)
   * 1 x 9 epochs: LR warmup for 1 epoch to 5e-5 then cosine annealing to 1e-8
     * Additional data mixed in, see [Training Data](#training-data)
-  * 1 x 5 epochs: LR warmup for
-  * 1 x 5 epochs: LR warmup for 0.
+  * 1 x 5 epochs: LR warmup for 0.5 epochs to 2.5e-5 then held constant (17,000 samples per epoch)
+  * 1 x 5 epochs: LR warmup for 0.5 epochs to 5e-6 then cosine annealing to 2.5e-6 (17,000 samples per epoch)
   * some restarts were required due to NaNs appearing in the gradient (see training logs)
 * **Total update steps:** ~200,000
 * **Hardware:** 4 x TPUv4 (provided by Google Cloud for the [HuggingFace JAX/Diffusers Sprint Event](https://github.com/huggingface/community-events/tree/main/jax-controlnet-sprint))
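
For reference, the completed schedules in this diff translate naturally into optax, the optimizer library used with JAX training loops. A minimal sketch follows, with loudly-labeled assumptions: the README gives samples per epoch but not the batch size, so `batch_size` (and therefore `steps_per_epoch`) is hypothetical, and AdamW is assumed only because the optimizer is not stated.

```python
import optax

samples_per_epoch = 10_000   # from the README ("10,000 samples per epoch")
batch_size = 8               # assumption: batch size is not stated in the README
steps_per_epoch = samples_per_epoch // batch_size

# "LR warmup for 1 epoch then held constant at 5e-5"
warmup_then_constant = optax.join_schedules(
    schedules=[
        optax.linear_schedule(0.0, 5e-5, steps_per_epoch),  # linear warmup over 1 epoch
        optax.constant_schedule(5e-5),                      # then hold
    ],
    boundaries=[steps_per_epoch],
)

# "LR warmup for 1 epoch to 5e-5 then cosine annealing to 1e-8" (9 epochs total)
warmup_then_cosine = optax.warmup_cosine_decay_schedule(
    init_value=0.0,
    peak_value=5e-5,
    warmup_steps=steps_per_epoch,
    decay_steps=9 * steps_per_epoch,  # decay_steps counts warmup + decay
    end_value=1e-8,
)

# assumption: the README does not name the optimizer
optimizer = optax.adamw(learning_rate=warmup_then_constant)
```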
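The note about NaNs appearing in the gradient can also be guarded against mechanically. The sketch below shows one generic optax mitigation, not what this run actually did (the README says the runs were restarted manually): `optax.apply_if_finite` skips updates whose gradients contain NaN/Inf.

```python
import optax

# Generic guard against NaN/Inf gradients (illustration only).
# apply_if_finite ignores any update whose gradients are not finite;
# after `max_consecutive_errors` consecutive bad steps it gives up and
# accepts the update anyway, so a persistent failure stays visible.
optimizer = optax.apply_if_finite(
    optax.adamw(learning_rate=5e-5),  # 5e-5 = peak LR from the schedules above
    max_consecutive_errors=5,         # assumption: arbitrary tolerance
)
```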