# CultriX/Qwen2.5-14B-Broca
This is a merge of pre-trained language models created using mergekit.
## Merge Details

### Merge Method

This model was merged using the della_linear merge method, with CultriX/Qwen2.5-14B-Wernickev3 as the base model.
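Conceptually, della_linear builds a task vector (delta from the base) for each model, sparsifies each delta with a magnitude-adaptive drop probability, rescales the surviving entries, and adds the weighted sum back onto the base. The sketch below illustrates that idea on flat NumPy arrays; it is a simplified approximation, not mergekit's actual implementation, and the function name and rank-based keep-probability formula are illustrative assumptions.

```python
import numpy as np

def della_linear_sketch(base, models, weights, densities, lam=1.1, eps=0.03, seed=0):
    """Illustrative sketch (NOT mergekit's implementation) of della_linear:
    per-model task-vector deltas are stochastically pruned with a
    magnitude-adaptive keep probability centered on `density`, rescaled,
    then linearly combined and added back to the base weights."""
    rng = np.random.default_rng(seed)
    merged_delta = np.zeros_like(base)
    for m, w, d in zip(models, weights, densities):
        delta = m - base  # task vector relative to the base model
        # Rank deltas by magnitude (0 = smallest) and spread the keep
        # probability by +/- eps around the density, so larger-magnitude
        # entries are kept more often.
        ranks = np.argsort(np.argsort(np.abs(delta)))
        keep_p = d + eps * (2.0 * ranks / (len(delta) - 1) - 1.0)
        mask = rng.random(delta.shape) < keep_p
        # Rescale kept entries by 1/keep_p so the expected delta is unbiased.
        kept = np.where(mask, delta / np.clip(keep_p, 1e-8, None), 0.0)
        merged_delta += w * kept
    # lambda scales the combined delta before it is added to the base.
    return base + lam * merged_delta
```

With density 1.0 and eps 0.0 nothing is dropped, so the result reduces to a plain weighted linear interpolation toward each model.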
### Models Merged

The following models were included in the merge:

- djuna/Q2.5-Veltha-14B-0.5
- CultriX/SeQwence-14B-EvolMerge
### Configuration

The following YAML configuration was used to produce this model:
```yaml
merge_method: della_linear
base_model: CultriX/Qwen2.5-14B-Wernickev3
dtype: bfloat16
parameters:
  epsilon: 0.03        # Refined for sharper parameter scaling.
  lambda: 1.1          # Balances blending while emphasizing significant contributions.
  normalize: true      # Ensures stable parameter integration across models.
adaptive_merge_parameters:
  task_weights:
    tinyArc: 1.3         # Logical reasoning enhancement.
    tinyHellaswag: 1.2   # Contextual understanding.
    tinyMMLU: 1.1        # Domain knowledge retention.
    tinyTruthfulQA: 1.4  # Prioritize truthful reasoning tasks.
    tinyWinogrande: 1.2  # Contextual reasoning boost.
    IFEval: 1.3          # Instruction-following and factual reasoning.
    BBH: 1.3             # Complex reasoning support.
    MATH: 1.4            # Mathematical problem-solving emphasis.
    GPQA: 1.3            # Factual QA improvement.
    MUSR: 1.2            # Multi-step reasoning enhancement.
    MMLU-PRO: 1.2        # Multitask domain consistency.
  smoothing_factor: 0.15 # Balances contributions for smoother integration.
gradient_clipping: 1.0   # Avoids over-contribution from any single model.
models:
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.5      # Backbone for multitasking and contextual benchmarks.
      density: 0.7     # Retain critical parameters for task-specific optimization.
  - model: djuna/Q2.5-Veltha-14B-0.5
    parameters:
      weight: 0.3      # Complements multitask strengths for IFEval and BBH.
      density: 0.8     # High density for consistent parameter integration.
  - model: CultriX/SeQwence-14B-EvolMerge
    parameters:
      weight: 0.2      # Balanced contributor for MUSR and GPQA.
      density: 0.6     # Moderate density to preserve diversity without overfitting.
tokenizer_source: CultriX/Qwen2.5-14B-Wernickev3
```
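To reproduce the merge, the configuration above can be saved to a file (here `config.yaml`, an illustrative name) and passed to mergekit's CLI; a minimal invocation, assuming mergekit is installed and a CUDA GPU is available, might look like:

```shell
# Install mergekit, then run the merge (paths are illustrative).
pip install mergekit
mergekit-yaml config.yaml ./Qwen2.5-14B-Broca --cuda
```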