kalomaze/Qwen3-16B-A3B

Safetensors
qwen3_moe
Community
16

Parameters / Experts - How to run this model ;

#16 opened 2 months ago by
DavidAU

DeepSeek R1 0528?

#15 opened 3 months ago by
Thireus

This model almost completely loses Chinese abilities

👍 1
3
#14 opened 3 months ago by
CHNtentes

Base version?

➕ 3
2
#13 opened 4 months ago by
ToastyPigeon

Russian language is missing

1
#12 opened 4 months ago by
Kosh69

Please, share the custom vLLM source you made

👀 1
#11 opened 4 months ago by
hyunw55

Update metadata 🤗

#10 opened 4 months ago by
merve

Model seems to not be performing correctly

1
#9 opened 4 months ago by
daniel-ltw

Larger model?

🧠 2
#8 opened 4 months ago by
blobbybob

number of experts +

🔥 🧠 2
#7 opened 4 months ago by
Danioken

Brainstorming

🧠 5
5
#6 opened 4 months ago by
Downtown-Case

Further training/distillation needed?

👀 1
1
#5 opened 4 months ago by
mingyi456

Besides pruning..

6
#4 opened 4 months ago by
Lockout

Context size? YaRN still supported?

2
#3 opened 4 months ago by
Thireus

Variants

#2 opened 4 months ago by
someone13574

code

➕ 17
#1 opened 4 months ago by
mrfakename