erax-ai/EraX-LLaMA3.1-8B-DeepSeekR1-MLA-MoE-Raw

Text Generation
Transformers
Safetensors
llama_deepseek
llama
deepseek
mla
Mixture of Experts
multihead_latent_attention
mixtured_of_experts
blended
  • 1 contributor
History: 40 commits
Latest commit: Update README.md (31d0f30, verified) by erax, 4 months ago
  • .gitattributes (1.52 kB): initial commit, 4 months ago
  • README.md (7.38 kB): Update README.md, 4 months ago
  • config.json (1.18 kB): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • generation_config.json (116 Bytes): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00001-of-00006.safetensors (4.91 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00002-of-00006.safetensors (4.99 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00003-of-00006.safetensors (4.97 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00004-of-00006.safetensors (4.89 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00005-of-00006.safetensors (4.98 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model-00006-of-00006.safetensors (1.6 GB, xet): Upload LlamaDeepSeekForCausalLM, 4 months ago
  • model.safetensors.index.json (35.1 kB): Upload LlamaDeepSeekForCausalLM, 4 months ago
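
The six shard files are tied together by model.safetensors.index.json. As a minimal sketch, assuming this index follows the usual Transformers sharded-checkpoint layout (a "weight_map" object that maps each tensor name to the shard file holding it), the snippet below groups tensor names by shard. The sample JSON, its tensor names, and its total_size value are hypothetical stand-ins, not contents read from this repository, and tensors_by_shard is an illustrative helper rather than a library function.

```python
import json
from collections import defaultdict

# Hypothetical, heavily truncated stand-in for model.safetensors.index.json.
# Tensor names and total_size are illustrative only, not taken from this repo.
SAMPLE_INDEX = """
{
  "metadata": {"total_size": 123},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00006.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00006.safetensors",
    "lm_head.weight": "model-00006-of-00006.safetensors"
  }
}
"""


def tensors_by_shard(index_text):
    """Group tensor names by the shard file that stores them."""
    index = json.loads(index_text)
    grouped = defaultdict(list)
    for tensor_name, shard_file in index["weight_map"].items():
        grouped[shard_file].append(tensor_name)
    # Sort names so the result is stable regardless of JSON key order.
    return {shard: sorted(names) for shard, names in grouped.items()}


shards = tensors_by_shard(SAMPLE_INDEX)
for shard_file in sorted(shards):
    # Print each shard file with the number of tensors it holds.
    print(shard_file, len(shards[shard_file]))
```

To inspect the real file, read its contents from a local copy and pass the text to tensors_by_shard in place of SAMPLE_INDEX.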