---
|
license: apache-2.0 |
|
language: |
|
- en |
|
pipeline_tag: text-generation |
|
inference: true |
|
tags: |
|
- mamba |
|
- bf16 |
|
- 16bit |
|
datasets: |
|
- cerebras/SlimPajama-627B |
|
--- |
|
# Mamba 2.8b SlimPajama - bf16 (16-bit)
|
|
|
This is a bf16 (16-bit) version of [Mamba-2.8b-slimpj](https://huggingface.co/state-spaces/mamba-2.8b-slimpj/).
|
|
|
Mamba-2.8b-slimpj is a 2.8B-parameter model using the [Mamba](https://arxiv.org/abs/2312.00752) architecture, trained on 600B tokens of the SlimPajama dataset.
|
|
|
Model code: https://github.com/state-spaces/mamba/tree/main |
|
|
|
To load the model, follow the installation instructions in the code repo, and then:
|
```python
import torch

from mamba_ssm.models.mixer_seq_simple import MambaLMHeadModel

# device="cuda" is needed by mamba_ssm's CUDA kernels;
# dtype=torch.bfloat16 loads the weights in 16-bit precision
model = MambaLMHeadModel.from_pretrained(
    "state-spaces/mamba-2.8b-slimpj", device="cuda", dtype=torch.bfloat16
)
```
|
|
|
## Inference Notebook (Colab) |
|
- [Notebook here](https://colab.research.google.com/drive/1GsDbbkDTDpia_Dc8s-7bwEn_GrpkBVO4?usp=sharing) |