Jackmin108/Moonlight-16B-A3B-Instruct-Fast
Tags: Text Generation, Transformers, Safetensors, deepseek_v3, conversational, custom_code, text-generation-inference
arXiv: 2502.16982
License: mit
Branch: main
1 contributor. History: 4 commits.
Latest commit: 9ff0b27, "update model weights", by Jackmin108, about 1 month ago.
Name                              Size      Badges     Last commit              Updated
figures                           -         -          original files           about 1 month ago
.gitattributes                    1.56 kB   Safe       original files           about 1 month ago
README.md                         8.36 kB   Safe       original files           about 1 month ago
config.json                       1.38 kB   Safe       original files           about 1 month ago
configuration_deepseek.py         10.7 kB   Safe       original files           about 1 month ago
generation_config.json            61 Bytes  Safe       original files           about 1 month ago
model-00001-of-00010.safetensors  1.34 GB   xet        update model weights     about 1 month ago
model-00002-of-00010.safetensors  2.51 GB   xet        update model weights     about 1 month ago
model-00003-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00004-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00005-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00006-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00007-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00008-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00009-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model-00010-of-00010.safetensors  3.51 GB   xet        update model weights     about 1 month ago
model.safetensors.index.json      33.6 kB   -          update model weights     about 1 month ago
modeling_deepseek.py              76.4 kB   -          use torchtitan moe impl  about 1 month ago
tiktoken.model                    2.8 MB    Safe, xet  original files           about 1 month ago
tokenization_moonshot.py          11.1 kB   Safe       original files           about 1 month ago
tokenizer_config.json             2.66 kB   Safe       original files           about 1 month ago
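
As a rough sanity check on the ten weight shards in the listing above, the sketch below sums their sizes and converts the total into an approximate parameter count. The shard sizes are copied from the listing. The 2 bytes per parameter figure is an assumption (for example bfloat16 storage), since the listing does not state the weight dtype, and the helper names are hypothetical.

```python
# Shard sizes in GB, copied from the file listing (shards 3 to 10 are 3.51 GB each).
SHARD_SIZES_GB = [1.34, 2.51] + [3.51] * 8


def total_checkpoint_gb(sizes_gb):
    """Sum the shard sizes and round to two decimals, matching the listing's precision."""
    return round(sum(sizes_gb), 2)


def approx_params_billion(total_gb, bytes_per_param=2):
    """Estimate the parameter count in billions.

    ASSUMPTION: each parameter takes `bytes_per_param` bytes (2 for bfloat16/float16)
    and 1 GB means 10**9 bytes. The listing itself does not state the dtype.
    """
    return round(total_gb / bytes_per_param, 1)


total_gb = total_checkpoint_gb(SHARD_SIZES_GB)
print(f"shards: {len(SHARD_SIZES_GB)}, total: {total_gb} GB")
print(f"approx. parameters at 2 bytes each: {approx_params_billion(total_gb)} B")
```

If the 2-byte assumption holds, the estimate is in line with the "16B" in the repository name. The sizes shown in the listing are rounded, so the result is only approximate.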