fxmeng/TransMLA-llama-2-7b-r64-n512-norm
Text Generation
Safetensors
fxmeng/transmla_pretrain_6B_tokens
English
deepseek_v3
custom_code
License: apache-2.0
README.md exists but content is empty.
Downloads last month: 5
Safetensors
Model size: 6.14B params
Tensor type: BF16
Model tree for fxmeng/TransMLA-llama-2-7b-r64-n512-norm
Base model: meta-llama/Llama-2-7b-hf
Finetuned (1092): this model
Dataset used to train fxmeng/TransMLA-llama-2-7b-r64-n512-norm
fxmeng/transmla_pretrain_6B_tokens • Updated 17 days ago • 5.94M • 146
Collection including fxmeng/TransMLA-llama-2-7b-r64-n512-norm
TransMLA-base (Collection): Base Model for TransMLA • 5 items • Updated Jun 15