CultriX/Qwen2.5-14B-FinalMerge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Merge Method

This model was merged using the DARE TIES merge method, with CultriX/SeQwence-14Bv1 as the base.
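
DARE drops a random fraction of each source model's parameter delta relative to the base (keeping roughly a density fraction of the entries) and rescales the survivors, while TIES resolves sign conflicts among the remaining deltas before they are combined. The sketch below is only an illustration of that idea for a single weight tensor, written in PyTorch with hypothetical inputs; mergekit's actual dare_ties implementation differs in its details.

import torch

def dare_ties_tensor(base, finetuned, weights, densities):
    # base: one weight tensor from the base model
    # finetuned: matching tensors from the models being merged
    # weights / densities: per-model values, as in the YAML configuration below
    deltas = []
    for ft, w, d in zip(finetuned, weights, densities):
        delta = ft - base                                  # task vector relative to the base
        keep = torch.bernoulli(torch.full_like(delta, d))  # DARE: drop entries with prob. 1 - density,
        deltas.append(w * delta * keep / d)                #       then rescale the survivors
    stacked = torch.stack(deltas)
    # TIES: elect a sign per parameter and keep only contributions that agree with it
    elected = torch.sign(stacked.sum(dim=0))
    agree = (torch.sign(stacked) == elected).float()
    merged = (stacked * agree).sum(dim=0)
    # normalize: true -> divide by the total weight that actually contributed at each position
    w_total = (agree * torch.tensor(weights).view(-1, *[1] * base.dim())).sum(dim=0)
    return base + merged / w_total.clamp(min=1e-8)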

Models Merged

The following models were included in the merge:

- allknowingroger/QwenSlerp6-14B
- CultriX/Qwen2.5-14B-Wernickev3
- CultriX/Qwen2.5-14B-Emergedv3
- VAGOsolutions/SauerkrautLM-v2-14b-DPO
- CultriX/Qwen2.5-14B-Unity
- qingy2019/Qwen2.5-Math-14B-Instruct

Configuration

The following YAML configuration was used to produce this model:

models:
  - model: CultriX/SeQwence-14Bv1
    parameters:
      weight: 0.22        # Boosted slightly to improve general task performance
      density: 0.62       # Prioritize generalist adaptability
  - model: allknowingroger/QwenSlerp6-14B
    parameters:
      weight: 0.18
      density: 0.59       # Slight increase to enhance contextual reasoning (tinyHellaswag)
  - model: CultriX/Qwen2.5-14B-Wernickev3
    parameters:
      weight: 0.16
      density: 0.56       # Minor increase to stabilize GPQA and MUSR performance
  - model: CultriX/Qwen2.5-14B-Emergedv3
    parameters:
      weight: 0.15        # Increase weight for domain-specific expertise
      density: 0.55
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.12
      density: 0.56       # Enhance factual reasoning and IFEval contributions
  - model: CultriX/Qwen2.5-14B-Unity
    parameters:
      weight: 0.10
      density: 0.53
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.10
      density: 0.51       # Retain focus on MATH and advanced reasoning tasks

merge_method: dare_ties
base_model: CultriX/SeQwence-14Bv1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-14B-Instruct

adaptive_merge_parameters:
  task_weights:
    IFEval: 1.5           # Strengthened for better instruction-following
    BBH: 1.3
    MATH: 1.6             # Emphasize advanced reasoning and problem-solving
    GPQA: 1.4             # Improve factual recall and logical QA tasks
    MUSR: 1.5             # Strengthened multi-step reasoning capabilities
    MMLU-PRO: 1.3         # Slight boost for domain-specific multitask knowledge
  smoothing_factor: 0.19   # Refined for smoother blending of task strengths
gradient_clipping: 0.88    # Tightened slightly for precise parameter contribution
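
To reproduce the merge, the configuration above can be saved to a YAML file and passed to mergekit. The sketch below assumes mergekit's Python API (MergeConfiguration, MergeOptions, run_merge) as shipped in the mergekit repository; the config path and output directory are placeholders, and the mergekit-yaml command-line tool is an equivalent alternative.

import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

CONFIG_YML = "./dare_ties_config.yaml"        # the YAML configuration above, saved to disk
OUTPUT_PATH = "./Qwen2.5-14B-FinalMerge"      # where the merged weights will be written

with open(CONFIG_YML, "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    out_path=OUTPUT_PATH,
    options=MergeOptions(
        cuda=torch.cuda.is_available(),       # run the merge on GPU when one is available
        copy_tokenizer=True,                  # copy tokenizer files from tokenizer_source
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)

The resulting directory can then be loaded with transformers (AutoModelForCausalLM / AutoTokenizer) like any other Qwen2.5-14B checkpoint in bfloat16.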