---
base_model:
- Qwen/Qwen2.5-14B-Instruct
- Qwen/Qwen2.5-14B
library_name: transformers
tags:
- mergekit
- merge
language:
- zho
- eng
- fra
- spa
- por
- deu
- ita
- rus
- jpn
- kor
- vie
- tha
- ara
---
# qwenselfbaseinstruct

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

Re-injected the base model into the instruct model in the intermediate layers while keeping the input and output layers the same (sophosympatheia gradient).

While this merge degraded the model's overall EQ-Bench score relative to instruct (76.9195 down to 73.8068),
it fixed the instruct model's habit of misspelling some of the emotion responses, and it still scores notably higher than the base model
(60.1027, also without any syntax errors).
It did throw one correctly spelled "didn't match reference" syntax error; I presume it replaced the emotion entirely or used a similar, grammatically correct one.

Looking at this as research evidence, it seems the instruct model picked up something specifically in the intermediate layers that occasionally hurts spelling.

I don't know whether this merge offers any other gain over using one or both components; it was done out of curiosity.
It might still be useful as more compact merge material if you wanted both base and instruct anyway.

## Merge Details
### Merge Method

This model was merged using the SLERP merge method.

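
SLERP interpolates each pair of corresponding weight tensors along the great-circle arc between them, rather than averaging them linearly. A minimal NumPy sketch of the idea (not mergekit's actual implementation):

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors."""
    # Angle between the two vectors, computed on normalized copies.
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation.
        return (1 - t) * v0 + t * v1
    # Interpolate along the arc; t=0 returns v0, t=1 returns v1.
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1
```

In this merge, t = 0 keeps the instruct weights (the `base_model` of the config) untouched, while larger t blends in more of the base model.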
### Models Merged

The following models were included in the merge:
* [Qwen/Qwen2.5-14B-Instruct](https://huggingface.co/Qwen/Qwen2.5-14B-Instruct)
* [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Qwen/Qwen2.5-14B
merge_method: slerp
base_model: Qwen/Qwen2.5-14B-Instruct
parameters:
  t:
    - value: [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
dtype: bfloat16
```
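
The eleven-entry `t` list is a gradient: the anchor values are stretched across the model's layers, so the endpoints stay at t = 0 (pure instruct weights) while the middle layers blend in up to 0.6 of the base model. A rough sketch of that mapping, assuming linear interpolation of the anchors and 48 transformer layers for Qwen2.5-14B:

```python
import numpy as np

# Gradient anchors from the mergekit config above.
anchors = [0, 0, 0.3, 0.4, 0.5, 0.6, 0.5, 0.4, 0.3, 0, 0]
num_layers = 48  # assumed layer count for Qwen2.5-14B

# Place the anchors evenly over [0, 1] and linearly interpolate a t value
# for each layer's relative depth.
anchor_pos = np.linspace(0.0, 1.0, len(anchors))
layer_pos = np.linspace(0.0, 1.0, num_layers)
t_per_layer = np.interp(layer_pos, anchor_pos, anchors)

# First and last layers get t = 0 (kept identical to instruct); the
# deepest blend of base-model weights occurs near the middle layers.
```

This is why the card describes the merge as re-injecting the base model only into the intermediate layers.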