deepseek-ai/DeepSeek-V2-Lite

Text Generation
Transformers
Safetensors
deepseek_v2
conversational
custom_code
text-generation-inference

The method get_max_length of 'DynamicCache' is deprecated and has been removed in transformers 4.49

#10 opened about 2 months ago by
login256

Fix for missing trailing space at the end of the chat template

#9 opened 3 months ago by
ShaneTian

OOM with int4 quant

#8 opened 4 months ago by
chungimungi

I know this is insane but is it possible?

#7 opened 5 months ago by
Assbang

MMLU benchmark performance on math domain

#6 opened 7 months ago by
Fighoture

Use try-except for flash_attn

#5 opened 8 months ago by
LiangliangMa

How do I fine-tune the deepseek-v2-lite model?

1
#2 opened 12 months ago by
guowl