---
license: apache-2.0
language:
- en
pipeline_tag: text-generation
inference: true
tags:
- mamba
- bf16
- 16bit
datasets:
- cerebras/SlimPajama-627B
---
# Mamba 2.8B SlimPajama - bf16 (16-bit)

This is a 16-bit (bf16) version of [Mamba-2.8b-slimpj](https://huggingface.co/state-spaces/mamba-2.8b-slimpj/).

Mamba-2.8b-slimpj is a model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, with 2.8B parameters, trained for 600B tokens on the SlimPajama dataset. 

Model code: https://github.com/state-spaces/mamba/tree/main

To load the model, follow the installation instructions in the code repo, and then:
```python
from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel
model = MambaLMHeadModel.from_pretrained("state-spaces/mamba-2.8b-slimpj")
```
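
Since this repo stores the weights in bf16, the key property is that each parameter takes 2 bytes instead of 4, halving memory. The conversion itself is plain PyTorch; the sketch below demonstrates it on a small stand-in module (a real run would load the Mamba model as above instead of `nn.Linear`):

```python
import torch
import torch.nn as nn

# Stand-in module for illustration; in practice this would be the
# model returned by MambaLMHeadModel.from_pretrained(...).
model = nn.Linear(1024, 1024)

fp32_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

# Cast all parameters to bfloat16 (same exponent range as fp32,
# reduced mantissa precision).
model = model.to(dtype=torch.bfloat16)
bf16_bytes = sum(p.numel() * p.element_size() for p in model.parameters())

print(bf16_bytes * 2 == fp32_bytes)  # parameter memory is halved
```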

## Inference Notebook (Colab)
- [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing)