merge

This is a merge of pre-trained language models created using mergekit.

Merge Details

Benchmarks

Tasks             Version  Filter            n-shot  Metric          Value      Stderr
tinyBenchmarks    N/A
- tinyArc         0        none              25      acc_norm     ↑  0.5911  ±  N/A
- tinyGSM8k       0        flexible-extract  5       exact_match  ↑  0.1913  ±  N/A
                           strict-match      5       exact_match  ↑  0.2107  ±  N/A
- tinyHellaswag   0        none              10      acc_norm     ↑  0.6603  ±  N/A
- tinyMMLU        0        none              0       acc_norm     ↑  0.6937  ±  N/A
- tinyTruthfulQA  0        none              0       acc          ↑  0.5596  ±  N/A
- tinyWinogrande  0        none              5       acc_norm     ↑  0.6752  ±  N/A

Merge Method

This model was merged with the Task Arithmetic merge method, using Qwen/Qwen2.5-7B-Instruct as the base.
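As a rough illustration of the idea behind Task Arithmetic, the sketch below adds a scaled "task vector" (fine-tuned weights minus base weights) back onto the base weights. It is a minimal toy version, not mergekit's implementation: the function name `task_arithmetic`, the flat-list tensors, and the example values are all hypothetical, and the way `weights` and `lam` combine here is an assumption about how the `weight` and `lambda` parameters behave.

```python
from typing import Dict, List

Tensor = List[float]  # a flat list of floats stands in for a weight tensor


def task_arithmetic(
    base: Dict[str, Tensor],
    finetuned: List[Dict[str, Tensor]],
    weights: List[float],
    lam: float = 1.0,
) -> Dict[str, Tensor]:
    """Return base + lam * sum_i weights[i] * (finetuned[i] - base), per tensor."""
    merged: Dict[str, Tensor] = {}
    for name, base_tensor in base.items():
        # Accumulate the weighted sum of task vectors for this tensor.
        delta = [0.0] * len(base_tensor)
        for model, w in zip(finetuned, weights):
            for j, (ft, b) in enumerate(zip(model[name], base_tensor)):
                delta[j] += w * (ft - b)
        # Scale the combined task vector (assumed role of lambda) and add it to the base.
        merged[name] = [b + lam * d for b, d in zip(base_tensor, delta)]
    return merged


# Hypothetical toy weights; real models have many tensors of much larger size.
base = {"layer.weight": [1.0, 2.0, 3.0]}
math_model = {"layer.weight": [2.0, 2.0, 1.0]}

merged_half = task_arithmetic(base, [math_model], weights=[1.0], lam=0.5)
merged_zero = task_arithmetic(base, [math_model], weights=[1.0], lam=0.0)
print(merged_half)  # {'layer.weight': [1.5, 2.0, 2.0]}
print(merged_zero)  # {'layer.weight': [1.0, 2.0, 3.0]}
```

Under this reading, a scaling factor of 0 leaves the base weights unchanged, and larger values move the result further toward the fine-tuned model.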

Models Merged

The following models were included in the merge:

Qwen/Qwen2.5-Math-7B-Instruct

Configuration

The following YAML configuration was used to produce this model:

slices:
  - sources:
      - model: Qwen/Qwen2.5-7B-Instruct
        layer_range: [0, 7]
      - model: Qwen/Qwen2.5-Math-7B-Instruct
        layer_range: [0, 7]
    parameters:
      weight: [0, 1]
      lambda: 0

  - sources:
      - model: Qwen/Qwen2.5-7B-Instruct
        layer_range: [7, 14]
      - model: Qwen/Qwen2.5-Math-7B-Instruct
        layer_range: [7, 14]
    parameters:
      weight: [0, 1]
      lambda: 0.25

  - sources:
      - model: Qwen/Qwen2.5-7B-Instruct
        layer_range: [14, 21]
      - model: Qwen/Qwen2.5-Math-7B-Instruct
        layer_range: [14, 21]
    parameters:
      weight: [0, 1]
      lambda: 0.5

  - sources:
      - model: Qwen/Qwen2.5-7B-Instruct
        layer_range: [21, 28]
      - model: Qwen/Qwen2.5-Math-7B-Instruct
        layer_range: [21, 28]
    parameters:
      weight: [0, 1]
      lambda: 0.75

merge_method: task_arithmetic
base_model: Qwen/Qwen2.5-7B-Instruct
dtype: float16
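To make the layer schedule in this configuration easier to read, the following sketch expands the four slices into a per-layer `lambda` value. The `(start, end, lambda)` tuples are copied from the YAML above; the helper name `lambda_for_layer` is hypothetical, and treating `layer_range` as a half-open interval `[start, end)` is an assumption that matches how the adjacent ranges tile layers 0 to 27.

```python
# (start, end, lambda) copied from the layer_range and lambda fields of each slice.
SLICES = [
    (0, 7, 0.0),
    (7, 14, 0.25),
    (14, 21, 0.5),
    (21, 28, 0.75),
]


def lambda_for_layer(layer: int) -> float:
    """Look up the lambda of the slice containing `layer` (ranges assumed half-open)."""
    for start, end, lam in SLICES:
        if start <= layer < end:
            return lam
    raise ValueError(f"layer {layer} is not covered by any slice")


# One lambda per transformer layer, in order.
schedule = [lambda_for_layer(i) for i in range(28)]
for start, end, lam in SLICES:
    print(f"layers {start}-{end - 1}: lambda = {lam}")
```

Read this way, the schedule rises in steps of 0.25 from the first block of seven layers to the last, so the Math model's contribution grows with depth.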


Model size: 7.62B params (Safetensors)

Tensor type: F16

Model tree for igomsmiranda/qwen2.5-7b_instruct-math7b_task_arithmetic

Merged from: Qwen/Qwen2.5-7B-Instruct, Qwen/Qwen2.5-Math-7B-Instruct