The missing "base model" of Qwen3-32B. This model serves as the foundation for our R1-0528 distillation work.

This model is the result of continued pre-training on Qwen3-32B, using a multilingual dataset of mixed code and text.

The purpose of this training is to produce a model close to a raw "pre-trained" state, reducing the influence of the original Qwen3's linguistic style on subsequent fine-tuning efforts.

We are providing this model to the community to serve as a base model for further SFT; it is not intended for direct inference.
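Since the model is meant as a starting checkpoint for further SFT rather than for direct use, a typical first step is simply loading it with the `transformers` library. The sketch below is an assumption about the standard Hugging Face workflow, not an official recipe from this repository; it loads the checkpoint in BF16 (the format the weights are stored in).

```python
# Hypothetical loading sketch for further fine-tuning (SFT).
# Model id is taken from this card; everything else is standard
# transformers usage, assumed rather than prescribed by the authors.
MODEL_ID = "OpenBuddy/OpenBuddy-Qwen3-Base-v26"

def load_base_model(model_id: str = MODEL_ID):
    """Load tokenizer and BF16 weights, ready to hand to an SFT trainer."""
    # Imports are local so the sketch can be read without the
    # (heavy) dependencies installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # checkpoint is stored in BF16
        device_map="auto",           # spread the 32.8B params across devices
    )
    return tokenizer, model
```

The resulting `model`/`tokenizer` pair can then be passed to any SFT framework (e.g. a `Trainer` over an instruction dataset); remember that, as stated above, this checkpoint is not tuned for direct chat-style inference.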

Model size: 32.8B params (Safetensors, BF16)

Base model: Qwen/Qwen3-32B (this model is one of its 47 fine-tunes)