gbyuvd committed cb0f7cb (verified) · Parent: 5a4b149

Update README.md

Files changed (1): README.md (+7 -3)
````diff
@@ -244,7 +244,7 @@ Data Preprocessing
 
 
 - Batch size = 128
-- Num of Epoch = 36
+- Num of Epoch = 36 (as separate runs of 10, 12, and 14 epochs; based on the training dynamics, one long run seems to work better than splitting it up like this)
 
 I am using Ranger21 optimizer with these settings:
 
@@ -252,10 +252,11 @@ I am using Ranger21 optimizer with these settings:
 Core optimizer = madgrad
 Learning rate of 1.5e-05
 
-num_epochs of training = ** 1 epochs **
+Important - num_epochs of training = ** _(10, 12, 14; separate run)_ epochs **
+please confirm this is correct or warmup and warmdown will be off
 
 using AdaBelief for variance computation
-Warm-up: linear warmup, over 964 iterations (0.22)
+Warm-up: linear warmup, over 2000 iterations
 
 Lookahead active, merging every 5 steps, with blend factor of 0.5
 Norm Loss active, factor = 0.0001
@@ -265,6 +266,9 @@ Gradient Centralization = On
 Adaptive Gradient Clipping = True
 clipping value of 0.01
 steps for clipping = 0.001
+params size saved
+total param groups = 1
+total params in groups = 137
 ```
 
 I turned off the warm down, since in prior experiments it led to instability of losses in my case.
````
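For reference, the settings printed in the log above can be collected in one place. This is only a sketch: the keyword names below are my assumptions based on the `ranger21` package and may not match the installed version's constructor signature exactly, so verify them before use.

```python
# Ranger21 settings from the log above, gathered as keyword arguments.
# NOTE: these argument names are assumptions from the ranger21 package
# and may differ between versions -- check your version's constructor.
ranger21_config = {
    "lr": 1.5e-05,                    # learning rate
    "use_madgrad": True,              # core optimizer = madgrad
    "use_adabelief": True,            # AdaBelief for variance computation
    "num_warmup_iterations": 2000,    # linear warmup
    "warmdown_active": False,         # warm down off (caused loss instability)
    "lookahead_active": True,
    "lookahead_mergetime": 5,         # merge every 5 steps
    "lookahead_blending_alpha": 0.5,  # blend factor
    "normloss_active": True,
    "normloss_factor": 0.0001,
    "using_gc": True,                 # gradient centralization
    "use_adaptive_gradient_clipping": True,
    "agc_clipping_value": 0.01,
    "agc_eps": 0.001,
}

# Usage sketch (requires torch and ranger21 to be installed):
# from ranger21 import Ranger21
# optimizer = Ranger21(model.parameters(), num_epochs=36,
#                      num_batches_per_epoch=batches_per_epoch,
#                      **ranger21_config)
```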
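The commit raises the linear warmup from 964 to 2000 iterations. Linear warmup just ramps the learning rate from zero to the base value over that many steps; a minimal sketch (the function name is mine, not Ranger21's):

```python
def warmup_lr(step, base_lr=1.5e-05, warmup_iters=2000):
    """Linear warmup: ramp the LR from 0 to base_lr over warmup_iters steps."""
    if step >= warmup_iters:
        return base_lr  # warmup finished, use the base learning rate
    return base_lr * (step / warmup_iters)
```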
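The `Adaptive Gradient Clipping = True` lines in the printout refer to unit-wise adaptive gradient clipping (Brock et al., 2021), where a gradient is rescaled whenever its norm exceeds a fixed fraction of the corresponding parameter norm; the `0.001` is presumably the eps floor on the parameter norm. A simplified scalar sketch, not Ranger21's actual tensor implementation:

```python
import math

def agc_clip(param, grad, clipping=0.01, eps=0.001):
    """Adaptive gradient clipping, simplified: rescale grad whenever
    ||grad|| exceeds clipping * max(||param||, eps)."""
    p_norm = max(math.sqrt(sum(x * x for x in param)), eps)
    g_norm = math.sqrt(sum(x * x for x in grad))
    limit = clipping * p_norm
    if g_norm > limit:
        # scale the gradient down onto the allowed norm
        return [x * limit / g_norm for x in grad]
    return grad
```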