Update README.md
README.md
````diff
@@ -244,7 +244,7 @@ Data Preprocessing
 
 
 - Batch size = 128
-- Num of Epoch= 36
+- Num of Epoch= 36 (10, 12, 14; separate runs - based on training dynamics, it seems training in one long run is better than doing separate runs like this.)
 
 I am using Ranger21 optimizer with these settings:
 
@@ -252,10 +252,11 @@ I am using Ranger21 optimizer with these settings:
 Core optimizer = madgrad
 Learning rate of 1.5e-05
 
-num_epochs of training = **
+Important - num_epochs of training = ** _(10, 12, 14; separate run)_ epochs **
+please confirm this is correct or warmup and warmdown will be off
 
 using AdaBelief for variance computation
-Warm-up: linear warmup, over
+Warm-up: linear warmup, over 2000 iterations
 
 Lookahead active, merging every 5 steps, with blend factor of 0.5
 Norm Loss active, factor = 0.0001
@@ -265,6 +266,9 @@ Gradient Centralization = On
 Adaptive Gradient Clipping = True
 clipping value of 0.01
 steps for clipping = 0.001
+params size saved
+total param groups = 1
+total params in groups = 137
 ```
 
 I turned off the warm down, since in prior experiments it led to instability of losses in my case.
````
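For reference, below is a minimal sketch of how the settings dump above could map onto a Ranger21 constructor call. The keyword names are taken from the public lessw2020/Ranger21 repository and may differ between releases, so treat them as assumptions to verify against your installed version; the model, `num_batches_per_epoch` value, and the explicit warmup-iteration argument are illustrative placeholders, not values from this commit.

```python
import torch
from ranger21 import Ranger21  # pip install ranger21

# Placeholder model and loader size, just so the optimizer has parameters to manage;
# with batch size 128, num_batches_per_epoch would normally be len(train_loader).
model = torch.nn.Linear(768, 2)
num_batches_per_epoch = 1000

optimizer = Ranger21(
    model.parameters(),
    lr=1.5e-5,                            # Learning rate of 1.5e-05
    use_madgrad=True,                     # Core optimizer = madgrad
    use_adabelief=True,                   # AdaBelief for variance computation
    num_epochs=36,                        # or 10/12/14 if split into separate runs
    num_batches_per_epoch=num_batches_per_epoch,
    use_warmup=True,
    warmup_type="linear",                 # linear warmup
    num_warmup_iterations=2000,           # over 2000 iterations
    warmdown_active=False,                # warm down off (caused loss instability here)
    lookahead_active=True,
    lookahead_mergetime=5,                # merging every 5 steps
    lookahead_blending_alpha=0.5,         # blend factor of 0.5
    normloss_active=True,
    normloss_factor=1e-4,                 # Norm Loss factor = 0.0001
    using_gc=True,                        # Gradient Centralization = On
    use_adaptive_gradient_clipping=True,
    agc_clipping_value=0.01,              # clipping value of 0.01
    agc_eps=0.001,                        # "steps for clipping" = 0.001
)
```

Once constructed, the optimizer is used like any other PyTorch optimizer (`optimizer.zero_grad()`, `loss.backward()`, `optimizer.step()`); Ranger21 prints the settings banner quoted above when it is instantiated.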