MoM Collection
Models from the papers MoM: Linear Sequence Modeling with Mixture-of-Memories and Retentive Network: A Successor to Transformer for Large Language Models.
The models were trained on a 15B-token sample of SlimPajama.
Because the MLP layer structure changed in the latest version of fla, these weights cannot be loaded with it; use the compatible version of fla instead.