Qwen/Qwen3-32B · Discussions

When can we see the base model for Qwen3 32B?

#41 opened about 1 month ago by

arpitsh018

Will Qwen3-32B be updated just like Qwen3-235B-A22B?

1

#40 opened about 1 month ago by

Enigrand

JSON formatting errors in Function Call scenarios

#39 opened about 2 months ago by

liopen

qweb1

#38 opened about 2 months ago by

MaxLin-un

MoE version with the same performance as this 32B dense

#37 opened about 2 months ago by

rtzurtz

Low Score on GSM8K on lm-eval-harness? (just 74.91)

2

#36 opened 2 months ago by

jonny-vr

Transformers version for Qwen3-32B-FP8

#35 opened 3 months ago by

as-il

Where is the Base Model?

👍 ➕ 7

#34 opened 3 months ago by

jonny-vr

I want to use this model to run my code

#33 opened 3 months ago by

yunxi0827

Qwen3-32B-Base?

👍 8

2

#32 opened 3 months ago by

canac84073

RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

#31 opened 3 months ago by

VMoorjani

Add assistant mask support to Qwen3-32B

#30 opened 3 months ago by

waleko

Setting 'enable_thinking=False' has no effect.

1

#29 opened 4 months ago by

ktrocks

finetune question

1

#28 opened 4 months ago by

Saicy

Qwen3ForCausalLM - Architecture issue

1

#26 opened 4 months ago by

cr-gkn

Request to Release the Base Model for Qwen3-32B

➕ 👍 14

#25 opened 4 months ago by

eramax

How to control thinking length?

➕ 9

2

#24 opened 4 months ago by

lidh15

Qwen3 does not deploy on Endpoints

#23 opened 4 months ago by

zenfiric

The model follows instructions too poorly

➕ 1

3

#22 opened 4 months ago by

xldistance

Update README.md

#21 opened 4 months ago by

Logical-Transcendence84

please release AWQ version

#20 opened 4 months ago by

classdemo

Collections of Bad Cases User Reviews and Comments of Qwen3 32B model

#19 opened 4 months ago by

DeepNLP

Potential issue with large context sizes - can someone confirm?

15

#18 opened 4 months ago by

Thireus

Does the presence of tools affect Qwen3's output length?

#17 opened 4 months ago by

evetsagg

"/no_think" control is unstable

1

#16 opened 4 months ago by

Smorty100

LICENSE files missing

👍 1

1

#14 opened 4 months ago by

johndoe2001

After setting /nothinking or enable_thinking=False, can the empty <thinking> tag be omitted from the response?

👍 3

2

#13 opened 4 months ago by

pteromyini

Feedback: It's a good model, however it hallucinates very badly at local facts (Germany)

👍 😔 11

2

#12 opened 4 months ago by

Dampfinchen

The correct way of fine-tuning on multi-turn trajectories

👍 11

2

#11 opened 4 months ago by

hr0nix

Providing a GPTQ version

👀 3

12

#10 opened 4 months ago by

blueteamqq1

How to set enable_thinking=False on Ollama

👍 6

2

#9 opened 4 months ago by

TatsuhiroC

🚀[Fine-tuning] Implementation and Best Practices for Qwen3 CPT/SFT/DPO/GRPO Training👋

🚀 🔥 3

#7 opened 4 months ago by

study-hjt

Reasoning or Non-reasoning model?

4

#6 opened 4 months ago by

dipta007

Local Installation Video and Testing - Step by Step

#5 opened 4 months ago by

fahdmirzac

【Evaluation】Best practice for evaluating Qwen3 !!

🚀 🔥 5

#4 opened 4 months ago by

wangxingjun778

Base Model?

😔 ➕ 8

13

#3 opened 4 months ago by

Downtown-Case

Is this multimodal?

😔 1

1

#2 opened 4 months ago by

pbarker

Add languages tag

👍 2

#1 opened 4 months ago by

de-francophones