Update README.md
README.md
@@ -90,7 +90,7 @@ More details about the evaluation setup and the new Norwegian benchmarks will be
 
 **Training Details:**
 - Training tokens: 250 billion
-- Batch size: 1,024 × 4,096 tokens
+- Batch size: 1,024 × 4,096 tokens (# sequences × sequence length)
 - Training steps: 60,000
 - Peak learning rate: 1e-4
 - Warm-up steps: 1,000
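
The clarified batch-size line can be sanity-checked against the other listed figures. The sketch below is not part of the README: the variable names are illustrative, and it assumes every step processes one full batch of 1,024 sequences of 4,096 tokens each, with no padding or gradient-accumulation effects.

```python
# Figures copied from the Training Details list; names are hypothetical.
sequences_per_batch = 1_024   # number of sequences in one batch
sequence_length = 4_096       # tokens per sequence
training_steps = 60_000
warmup_steps = 1_000

# Assumption: one full batch is consumed per training step.
tokens_per_step = sequences_per_batch * sequence_length
total_tokens = tokens_per_step * training_steps
warmup_tokens = tokens_per_step * warmup_steps

print(f"tokens per step: {tokens_per_step:,}")
print(f"total tokens:    {total_tokens:,}")
print(f"warm-up tokens:  {warmup_tokens:,}")
```

Under these assumptions the total comes to roughly 251.7 billion tokens, which agrees with the "250 billion" training-token figure once rounded.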