---
base_model:
- CultriX/Qwen2.5-14B-MegaMerge-pt2
- qingy2019/Qwen2.5-Math-14B-Instruct
- CultriX/SeQwence-14B
- v000000/Qwen2.5-Lumen-14B
- arcee-ai/Virtuoso-Small
- Qwen/Qwen2.5-14B
library_name: transformers
tags:
- mergekit
- merge
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
# merge

This is a merge of pre-trained language models created with [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged with the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) merge method, using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as the base model.
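To give an intuition for what DARE-TIES does, the toy sketch below applies the two steps to small NumPy tensors: DARE randomly drops a fraction of each model's delta from the base and rescales the survivors, and TIES keeps only the entries whose sign agrees with the weighted majority before summing. The helper names (`dare_prune`, `ties_merge`) are illustrative, not mergekit's API; the real implementation lives in mergekit.

```python
# Illustrative sketch of the DARE-TIES update rule on toy tensors.
# Helper names are hypothetical; mergekit's actual implementation differs.
import numpy as np

rng = np.random.default_rng(0)

def dare_prune(delta, density, rng):
    """DARE: randomly drop (1 - density) of the delta's entries,
    then rescale the survivors by 1/density to preserve the expectation."""
    mask = rng.random(delta.shape) < density
    return np.where(mask, delta / density, 0.0)

def ties_merge(deltas, weights):
    """TIES sign election: elect a majority sign per entry, zero out
    entries that conflict with it, then take the weighted sum."""
    stacked = np.stack([w * d for w, d in zip(weights, deltas)])
    elected = np.sign(stacked.sum(axis=0))      # majority sign per entry
    agree = np.sign(stacked) == elected         # mask for agreeing entries
    return np.where(agree, stacked, 0.0).sum(axis=0)

# Toy stand-ins for a base model's parameters and three fine-tunes of it.
base = rng.normal(size=(4, 4))
finetuned = [base + rng.normal(scale=0.1, size=base.shape) for _ in range(3)]
weights = [0.35, 0.30, 0.20]
density = 0.6

deltas = [dare_prune(ft - base, density, rng) for ft in finetuned]
merged = base + ties_merge(deltas, weights)
```

In the real merge, each model's `weight` and `density` from the configuration below play the roles of `weights` and `density` here, applied per tensor across the whole network.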
### Models Merged

The following models were included in the merge:

* [CultriX/Qwen2.5-14B-MegaMerge-pt2](https://huggingface.co/CultriX/Qwen2.5-14B-MegaMerge-pt2)
* [qingy2019/Qwen2.5-Math-14B-Instruct](https://huggingface.co/qingy2019/Qwen2.5-Math-14B-Instruct)
* [CultriX/SeQwence-14B](https://huggingface.co/CultriX/SeQwence-14B)
* [v000000/Qwen2.5-Lumen-14B](https://huggingface.co/v000000/Qwen2.5-Lumen-14B)
* [arcee-ai/Virtuoso-Small](https://huggingface.co/arcee-ai/Virtuoso-Small)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: qingy2019/Qwen2.5-Math-14B-Instruct
    parameters:
      weight: 0.35
      density: 0.6
  - model: arcee-ai/Virtuoso-Small
    parameters:
      weight: 0.30
      density: 0.6
  - model: CultriX/Qwen2.5-14B-MegaMerge-pt2
    parameters:
      weight: 0.20
      density: 0.5
  - model: CultriX/SeQwence-14B
    parameters:
      weight: 0.15
      density: 0.4
  - model: v000000/Qwen2.5-Lumen-14B
    parameters:
      weight: 0.10
      density: 0.5
base_model: Qwen/Qwen2.5-14B
merge_method: dare_ties
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-14B-Instruct
```
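For reference, a merge like this can typically be reproduced with the mergekit CLI. The config path and output directory below are assumptions for illustration; the command will download all listed source models, so substantial disk space and (optionally) a GPU are needed.

```shell
# Install mergekit, then run the merge from a file containing the YAML above.
pip install mergekit
# config.yaml: the configuration from this card; ./merged: output directory.
mergekit-yaml config.yaml ./merged --cuda
```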