Update README.md
README.md CHANGED
@@ -24,4 +24,6 @@ We finetuned the `wte` and `wpe` layers of GPT-2 (while freezing the parameters
 - evaluation_strategy: "steps"
 - max_eval_samples: 5000
 ```
-**Training details**: total training steps: 457000, effective train batch size per step: 32, max tokens per batch: 1024)
+**Training details**: total training steps: 457000, effective train batch size per step: 32, max tokens per batch: 1024
+
+**Final checkpoint**: checkpoint-457000
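
For context on the change above, here is a minimal sketch of the embedding-only finetuning the README describes, i.e. training the `wte` and `wpe` layers while freezing the rest of GPT-2. It assumes the Hugging Face `transformers` `GPT2LMHeadModel`; the optimizer choice and learning rate are illustrative assumptions, not values from the README.

```python
# Minimal sketch of embedding-only finetuning: train GPT-2's token (wte)
# and position (wpe) embeddings while freezing all other parameters.
import torch
from transformers import GPT2LMHeadModel

model = GPT2LMHeadModel.from_pretrained("gpt2")

# Freeze every parameter first.
for param in model.parameters():
    param.requires_grad = False

# Unfreeze only the two input-embedding layers named in the README.
for embedding in (model.transformer.wte, model.transformer.wpe):
    for param in embedding.parameters():
        param.requires_grad = True

# Note: GPT2LMHeadModel ties lm_head.weight to wte.weight, so training
# wte also updates the output projection.

# Hand only the trainable parameters to the optimizer; AdamW and the
# learning rate are assumptions for illustration.
optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=5e-5
)
```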