This is a merge of 4 pre-trained Qwen language models, created using mergekit.

This model aims for general reasoning ability by merging several Qwen 3 4B models trained on multiple reasoning datasets.
This model was merged using the Linear merge method, with ertghiu256/qwen3-multi-reasoner as the base.
The following models were included in the merge:

* ertghiu256/qwen-3-4b-mixture-of-thought
* ertghiu256/qwen3-4b-code-reasoning
* ertghiu256/qwen3-math-reasoner
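
Conceptually, the Linear method computes a weighted average of the corresponding parameter tensors of the source models, and because `normalize: 1.0` is set in the configuration below, the sum is divided by the total of the weights. The snippet below is a minimal sketch of that idea only, assuming all four checkpoints share an identical parameter layout and enough memory to hold them at once; it is not mergekit's actual implementation, which works per-slice and far more efficiently.

```python
# Minimal sketch of a normalized linear merge (illustration only, not mergekit's code).
import torch
from transformers import AutoModelForCausalLM

# Per-model weights taken from the YAML configuration below.
sources = {
    "ertghiu256/qwen3-multi-reasoner": 0.7,
    "ertghiu256/qwen-3-4b-mixture-of-thought": 0.9,
    "ertghiu256/qwen3-4b-code-reasoning": 0.8,
    "ertghiu256/qwen3-math-reasoner": 0.6,
}

models = {
    repo: AutoModelForCausalLM.from_pretrained(repo, torch_dtype=torch.float16)
    for repo in sources
}
param_maps = {repo: dict(m.named_parameters()) for repo, m in models.items()}

merged = models["ertghiu256/qwen3-multi-reasoner"]  # base model holds the result
total_weight = sum(sources.values())                # normalize: 1.0 -> divide by the weight sum

with torch.no_grad():
    for name, param in merged.named_parameters():
        # Accumulate the weighted average in float32 for numerical stability.
        acc = torch.zeros_like(param, dtype=torch.float32)
        for repo, weight in sources.items():
            acc += weight * param_maps[repo][name].to(torch.float32)
        param.copy_((acc / total_weight).to(param.dtype))

merged.save_pretrained("./qwen3-4b-linear-merge")
```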
The following YAML configuration was used to produce this model:
```yaml
base_model: ertghiu256/qwen3-multi-reasoner
dtype: float16
merge_method: linear
modules:
  default:
    slices:
      - sources:
          - layer_range: [0, 36]
            model: ertghiu256/qwen3-multi-reasoner
            parameters:
              weight: 0.7
          - layer_range: [0, 36]
            model: ertghiu256/qwen-3-4b-mixture-of-thought
            parameters:
              weight: 0.9
          - layer_range: [0, 36]
            model: ertghiu256/qwen3-4b-code-reasoning
            parameters:
              weight: 0.8
          - layer_range: [0, 36]
            model: ertghiu256/qwen3-math-reasoner
            parameters:
              weight: 0.6
parameters:
  int8_mask: 1.0
  normalize: 1.0
```
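
To reproduce the merge, this configuration can be passed to mergekit's CLI (typically something like `mergekit-yaml config.yaml ./output`). The merged checkpoint can then be used like any other Qwen 3 causal language model; the sketch below assumes a recent transformers version, and the repo id is a placeholder for this repository's actual name.

```python
# Quick usage sketch for the merged model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "ertghiu256/qwen3-4b-merged"  # placeholder: replace with this repository's id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id, torch_dtype=torch.float16, device_map="auto"
)

messages = [{"role": "user", "content": "Solve: if 3x + 5 = 20, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512)
# Print only the newly generated tokens, not the prompt.
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```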