Two of the base models are missing
#17, opened by ZhangRC
Two of the base models are missing:
- Qwen3-235B-A22B-Base
- Qwen3-32B-Base
Since the blog post announcing the Qwen3 release reported Qwen3-235B-A22B-Base's results on various benchmarks, I sincerely urge that the two base models be released.
Releasing them would help in several ways:
- Reproducibility of benchmark results
- Validation of scaling laws
- Fine-tuning on a different domain from math, reasoning and code, such as novels
The exclusion of Qwen3-32B-Base and Qwen3-235B-A22B-Base makes the release feel inconsistent. A complete and consistent release would both show sincerity and better serve the community.
Patience, young Padawan