---
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
inference: true |
|
tags: |
|
- mamba |
|
- bf16 |
|
- 16bit |
|
datasets: |
|
- cerebras/SlimPajama-627B |
|
--- |
|
# Mamba 2.8b SlimPajama - bf16 (16-bit)
|
|
|
This is a bf16 (16-bit) version of [Mamba-2.8b-slimpj](https://huggingface.co/state-spaces/mamba-2.8b-slimpj/).
|
|
|
Mamba-2.8b-slimpj is a 2.8B-parameter model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, trained on 600B tokens of the SlimPajama dataset.
|
|
|
Model code: https://github.com/state-spaces/mamba/tree/main |
|
|
|
To load the model, follow the installation instructions in the code repo, and then:
|
```python
import torch

from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# device="cuda" is needed by mamba_ssm's CUDA kernels;
# dtype=torch.bfloat16 loads the weights in 16-bit precision
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-2.8b-slimpj", device="cuda", dtype=torch.bfloat16
)
```
|
|
|
## Inference Notebook (Colab) |
|
- [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing) |