---
license: apache-2.0
datasets:
- cerebras/SlimPajama-627B
language:
- en
---
Model from the paper *MoM: Linear Sequence Modeling with Mixture-of-Memories*.
The model was trained on a 15B-token sample of SlimPajama. We use Gated DeltaNet as the memory update mechanism.
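As a rough illustration of the memory update mechanism mentioned above, the following is a minimal NumPy sketch of a gated delta-rule recurrence in the style of Gated DeltaNet, `S_t = α_t · S_{t-1}(I − β_t k_t k_tᵀ) + β_t v_t k_tᵀ`. All names and shapes here are illustrative assumptions, not the actual training code for this model.

```python
import numpy as np

def gated_delta_update(S, k, v, alpha, beta):
    """One gated delta-rule memory update (illustrative sketch).

    S:     (d_v, d_k) memory matrix S_{t-1}
    k:     (d_k,)     key vector k_t
    v:     (d_v,)     value vector v_t
    alpha: scalar decay gate in [0, 1]
    beta:  scalar write strength in [0, 1]
    Returns S_t = alpha * S (I - beta * k k^T) + beta * v k^T.
    """
    d_k = k.shape[0]
    erase = np.eye(d_k) - beta * np.outer(k, k)   # remove old content along k
    write = beta * np.outer(v, k)                 # write new association v -> k
    return alpha * S @ erase + write

# Example: writing into an empty memory and reading it back.
S = np.zeros((3, 2))
k = np.array([1.0, 0.0])          # unit-norm key
v = np.array([1.0, 2.0, 3.0])
S = gated_delta_update(S, k, v, alpha=1.0, beta=1.0)
print(S @ k)                      # recovers v: [1. 2. 3.]
```

With a unit-norm key and `beta = 1`, the erase term fully clears the old value stored along `k` before the new one is written, which is the delta-rule behavior; `alpha` adds the global decay gate.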