Two of the base models are missing

#17
by ZhangRC - opened

Two of the base models are missing:

Qwen3-235B-A22B-Base
Qwen3-32B-Base

Since the blog post announcing the Qwen3 release reported Qwen3-235B-A22B-Base's results on various benchmarks, I sincerely urge you to release these two base models.

Releasing them would help in several ways:

  1. Reproducibility of benchmark results
  2. Validation of scaling laws
  3. Fine-tuning on domains other than math, reasoning, and code, such as novels

The exclusion of Qwen3-32B-Base and Qwen3-235B-A22B-Base makes the release feel inconsistent. A complete and consistent release would both show sincerity and better serve the community.

Patience, young Padawan
