Qwen/Qwen3-32B · RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

When running the code snippet provided in the model card, I get the following error:

RuntimeError: probability tensor contains either inf, nan or element < 0

I can avoid this error by setting do_sample=False in the model.generate call. However, the output is then garbage (I reduced max_new_tokens to 10, because otherwise it just generates garbage endlessly):

thinking content:
content: ivate!!!!!!!!Le
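
The error message names three conditions (`inf`, `nan`, or an element below 0) that make a probability vector unusable for sampling. A minimal sketch of that check in plain Python follows; the helper name `invalid_entries` and the sample vectors are hypothetical and only illustrate the condition, they are not taken from the model. On real tensors, `torch.isnan` and `torch.isinf` would be the natural equivalents.

```python
import math

def invalid_entries(probs):
    """Return (index, value) pairs that are NaN, infinite, or negative.

    Hypothetical helper: it mirrors the three conditions named in the
    error message, not the library's internal implementation.
    """
    return [
        (i, p)
        for i, p in enumerate(probs)
        if math.isnan(p) or math.isinf(p) or p < 0
    ]

# Made-up example vectors, for illustration only.
good = [0.7, 0.2, 0.1]
bad = [0.5, float("nan"), -0.1, float("inf")]

print(invalid_entries(good))                   # no offending entries
print([i for i, _ in invalid_entries(bad)])    # indices of offending entries
```

If a check like this flags the model's output values before sampling, greedy decoding would be running on the same invalid values, which could be why turning sampling off removes the exception without giving sensible text.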

When I print the model config, I see that the vocab size is 151936. The tokenizer has a vocab size of 151643, plus special tokens, so len(tokenizer) returns 151669. I believe the runtime error is explained by this discrepancy between the model's vocab size and the number of tokens in the tokenizer, but I am not sure what to do about it or how to debug it.
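
The size comparison described above can be written down as a small check. This is a sketch using only the numbers reported in this post; the function name `compare_vocab_sizes` is hypothetical. With a loaded model and tokenizer, the two inputs would come from `model.config.vocab_size` and `len(tokenizer)`.

```python
def compare_vocab_sizes(model_vocab_size: int, tokenizer_len: int) -> dict:
    """Summarise the gap between the model's vocab size and the tokenizer length.

    Hypothetical helper for illustration. With real objects, pass
    model.config.vocab_size and len(tokenizer).
    """
    unused_rows = model_vocab_size - tokenizer_len
    return {
        "model_vocab_size": model_vocab_size,
        "tokenizer_len": tokenizer_len,
        # Ids the model has outputs for but the tokenizer never produces.
        "unused_rows": unused_rows,
        # True when every tokenizer id is within the model's vocab range.
        "tokenizer_fits": unused_rows >= 0,
    }

# Values reported in this post.
report = compare_vocab_sizes(151936, 151669)
print(report)
```

For these numbers the model has 267 more entries than the tokenizer, so every tokenizer id falls inside the model's range. The check only shows which side is larger; it does not by itself confirm that the discrepancy causes the error.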

I am using transformers 4.52.4.
