Commit · c94e6d5
1 Parent(s): cb1043c
Update README.md
README.md CHANGED
@@ -52,5 +52,6 @@ The following hyperparameters were used during pre-training:
 - num_devices: 4
 - batch_size: 512
 - training_steps: 500,000
-- encoder
--
+- encoder layers: 6
+- decoder layers: 6
+- hidden size: 768