---
base_model:
- Qwen/QwQ-32B
library_name: transformers
tags:
- mergekit
- merge
---
# QwQ-32B Kumo

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: slerp
base_model: Qwen/QwQ-32B
models:
  - model: Qwen/QwQ-32B
  - model: NovaSky-AI/Sky-T1-32B-Flash
parameters:
  t: 0.4
dtype: bfloat16
name: merge_model_1
---
merge_method: breadcrumbs_ties
base_model: Qwen/QwQ-32B
tokenizer_source: Qwen/QwQ-32B
name: merge_model_2
models:
  - model: Qwen/QwQ-32B
    parameters:
      weight: 1.0
  - model: FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview
    parameters:
      weight: 0.75
dtype: bfloat16
---
merge_method: task_arithmetic
base_model: Qwen/Qwen2.5-32B
name: merge_model_3
models:
  - model: rinna/deepseek-r1-distill-qwen2.5-bakeneko-32b
    parameters:
      weight: 1.0
  - model: cyberagent/DeepSeek-R1-Distill-Qwen-32B-Japanese
    parameters:
      weight: 0.9
tokenizer_source: base
dtype: bfloat16
---
merge_method: slerp
base_model: Qwen/QwQ-32B
models:
  - model: Qwen/QwQ-32B
  - model: TeamDelta/ABEJA-Qwen2.5-32B-base-jp-v0.1
parameters:
  t: 0.5
tokenizer_source: base
dtype: bfloat16
name: merge_model_4
---
merge_method: model_stock
base_model: Qwen/QwQ-32B
models:
  - model: Qwen/QwQ-32B
  - model: merge_model_1
  - model: merge_model_2
  - model: merge_model_3
  - model: merge_model_4
dtype: bfloat16
pad_to_multiple_of: 512
tokenizer_source: base
name: QwQ-32B-Kumo
```
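
### Usage

Because the final `model_stock` stage inherits the QwQ-32B tokenizer (`tokenizer_source: base`) and the card declares `library_name: transformers`, the merge should load like any other causal LM. Below is a minimal sketch; the repo id `QwQ-32B-Kumo` is a placeholder for wherever this merge is actually published, and the prompt is illustrative only.

```python
# Minimal loading sketch, assuming the merge is published under a
# transformers-compatible repo. "QwQ-32B-Kumo" is a placeholder id.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "QwQ-32B-Kumo"  # placeholder: substitute the real repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used throughout the merge
    device_map="auto",
)

messages = [{"role": "user", "content": "Explain slerp merging in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```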