Update README.md
README.md
CHANGED
@@ -6,6 +6,12 @@ datasets:
 
 RWKV-7 trained on the Pile w/ "20b tokenizer" (332115325534 tokens)
 
+0.1B = L12-D768, lr 8e-4 to 3e-5 cosine decay, wd 0.1, bsz 8x30x4096
+
+0.4B = L24-D1024, lr 6e-4 to 2e-5 cosine decay, wd 0.1, bsz 8x30x4096
+
+1.5B = L24-D2048, lr 5e-4 to 1.5e-5 cosine decay, wd 0.1, bsz 8x45x4096
+
 Check https://github.com/BlinkDL/RWKV-LM for details.
 
 How to run it:
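
Note on the added lines: "lr 8e-4 to 3e-5 cosine decay" can be read as a learning rate that follows a half-cosine from the first value to the second over the course of training. The sketch below illustrates that reading with a generic cosine schedule. It is not taken from the RWKV-LM training code: the function name `cosine_lr`, the `progress` parameter (fraction of training completed), and the absence of warmup are all assumptions. Treating `8x30x4096` as a product that gives tokens per optimizer step is likewise an assumption.

```python
import math


def cosine_lr(progress: float, lr_init: float, lr_final: float) -> float:
    """Generic cosine decay from lr_init (progress=0) to lr_final (progress=1).

    Hypothetical helper for illustration; not the RWKV-LM implementation.
    """
    progress = min(max(progress, 0.0), 1.0)  # clamp to [0, 1]
    return lr_final + 0.5 * (lr_init - lr_final) * (1.0 + math.cos(math.pi * progress))


# (lr_init, lr_final) pairs copied from the added README lines.
SCHEDULES = {
    "0.1B": (8e-4, 3e-5),
    "0.4B": (6e-4, 2e-5),
    "1.5B": (5e-4, 1.5e-5),
}

# Assumption: "bsz 8x30x4096" multiplies out to tokens per optimizer step.
tokens_per_step = 8 * 30 * 4096

for name, (lr_init, lr_final) in SCHEDULES.items():
    start = cosine_lr(0.0, lr_init, lr_final)
    mid = cosine_lr(0.5, lr_init, lr_final)
    end = cosine_lr(1.0, lr_init, lr_final)
    print(f"{name}: start={start:.2e} mid={mid:.2e} end={end:.2e}")
```

Under this reading, the halfway point of each schedule is the arithmetic mean of the two listed rates. The actual schedule (warmup, decay endpoint, step count) should be checked against the RWKV-LM repository.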