Update README.md
Browse files
README.md
CHANGED
@@ -1,3 +1,11 @@
|
|
1 |
-
---
|
2 |
-
license: apache-2.0
|
3 |
-
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
---
|
2 |
+
license: apache-2.0
|
3 |
+
datasets:
|
4 |
+
- cerebras/SlimPajama-627B
|
5 |
+
language:
|
6 |
+
- en
|
7 |
+
---
|
8 |
+
|
9 |
+
Model of the paper [MoM: Linear Sequence Modeling with Mixture-of-Memories](https://arxiv.org/abs/2502.13685) and [Gated Delta Networks: Improving Mamba2 with Delta Rule](https://arxiv.org/abs/2412.06464).
|
10 |
+
|
11 |
+
The model was trained on a sample of SlimPajama with 15B tokens.
|