Parameters / Experts - How to run this model
#16 opened 2 months ago
by
DavidAU

DeepSeek R1 0528?
#15 opened 3 months ago
by
Thireus

This model almost completely loses Chinese abilities
1
3
#14 opened 3 months ago
by
CHNtentes
Base version?
3
2
#13 opened 4 months ago
by
ToastyPigeon

Russian language is missing
1
#12 opened 4 months ago
by
Kosh69
Please, share the custom vLLM source you made
1
#11 opened 4 months ago
by
hyunw55
Update metadata
#10 opened 4 months ago
by
merve

Model seems to not be performing correctly
1
#9 opened 4 months ago
by
daniel-ltw
Larger model?
2
#8 opened 4 months ago
by
blobbybob

number of experts +
2
#7 opened 4 months ago
by
Danioken
Brainstorming
5
5
#6 opened 4 months ago
by
Downtown-Case
Further training/distillation needed?
1
1
#5 opened 4 months ago
by
mingyi456
Besides pruning..
6
#4 opened 4 months ago
by
Lockout

Context size? YaRN still supported?
2
#3 opened 4 months ago
by
Thireus

Variants
#2 opened 4 months ago
by
someone13574
code
17
#1 opened 4 months ago
by
mrfakename
