When deploying with sglang, how can switching between thinking and non-thinking modes be supported?
#28 opened 2 days ago by verigle
Скарты
#27 opened 6 days ago by artartarta
Upload Cadient Revenue Radar 2026.xlsx
#26 opened 8 days ago by basisakai
Question: Why are the definitions related to max-model-len in config.json and tokenizer_config.json inconsistent?
#25 opened 9 days ago by foyoux
Request: DOI
#24 opened 10 days ago by xtolxy1
Is it possible to run inference on an A100 GPU?
2
#23 opened 12 days ago by Tony664
3.2 Exp 32B or distilled Qwen?
#22 opened 17 days ago by guizpublic
DeepSeek-V3.2: latest comprehensive hands-on evaluation released (300+ dimensions); you are welcome to join the group to discuss
#17 opened 23 days ago by JEIN
Question about long-context evaluation in DeepSeek-V3.2-Exp
1
#15 opened 25 days ago by fcMpKYz6Avp5QK
National Day deepwork
➕
🤗
5
#14 opened 25 days ago by fengyujian
Could the API endpoint for the older deepseek v3.1 be kept available permanently?
❤️
👍
3
7
#10 opened 26 days ago by lixin4sky
Full Coverage Video of V3.2 - Step by Step
👍
2
#9 opened 26 days ago by fahdmirzac
The whale is back
❤️
7
#8 opened 26 days ago by Nechintosh
How much VRAM?
5
#7 opened 26 days ago by Ni3SinghR
Transformers does not recognize this architecture
6
#6 opened 26 days ago by eva20150932-atlascloud
Context length
3
#5 opened 26 days ago by cheflee668
Did this model really have to be updated before National Day??
😔
👍
111
31
#1 opened 26 days ago by luckjone