mergesloppa123123 committed
Commit 0bbba9d · verified · 1 Parent(s): 95a4835

Update README.md

Files changed (1): README.md (+21 −12)
README.md CHANGED
@@ -1,25 +1,27 @@
 ---
-base_model: []
 library_name: transformers
 tags:
 - mergekit
 - merge
-
 ---
-# hanamixv3
 
-This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 
-## Merge Details
-### Merge Method
 
-This model was merged using the SLERP merge method.
 
-### Models Merged
 
-The following models were included in the merge:
-* ../L3.1-70B-Hanami-x1
-* ../Hermes-3-Llama-3.1-70B
 
 ### Configuration
 
@@ -66,5 +68,12 @@ merge_method: slerp
 base_model: ../Hermes-3-Llama-3.1-70B
 dtype: bfloat16
 tokenizer_source: ../Hermes-3-Llama-3.1-70B
-
 ```

 ---
+license: cc-by-nc-4.0
+base_model:
+- NousResearch/Hermes-3-Llama-3.1-70B
+- Sao10K/L3.1-70B-Hanami-x1
 library_name: transformers
 tags:
 - mergekit
 - merge
 ---
 
+# Hanames-90B-L3.1
+It's a stack merge meme model made from Hermes 3 and Hanami-x1. It uses a similar formula to my previous stack merge, but swaps in Hanami-x1 and adds some mild SLERPing of the slices. Coherence seems improved as a result, and the model stays fun to use. You should use it for roleplay and creative writing AND PROBABLY NOTHING ELSE (but hey, it's your funeral).
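If you've never read a mergekit recipe, the general shape of a slice-level SLERP config looks something like this. This is a sketch only: the layer ranges and `t` value here are made up, and the actual config (its tail is shown under Configuration below) stacks duplicated slices to get from 70B to roughly 90B:

```yaml
# Sketch only: illustrative layer_range and t values, not this model's actual recipe
slices:
  - sources:
      - model: ../Hermes-3-Llama-3.1-70B
        layer_range: [0, 80]
      - model: ../L3.1-70B-Hanami-x1
        layer_range: [0, 80]
merge_method: slerp
base_model: ../Hermes-3-Llama-3.1-70B
parameters:
  t: 0.3  # 0 = pure base model, 1 = pure Hanami-x1
dtype: bfloat16
tokenizer_source: ../Hermes-3-Llama-3.1-70B
```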
 
+## STACK MERGE DISCLAIMER
+Yes, it's just a stack merge; no, I didn't do any additional pretraining; no, stack merges don't make the model smarter; yes, they harm its ability to do complex logical tasks; yes, they introduce some weird behaviors and unexpected mistakes; no, they don't make the model sentient; and no, you shouldn't post on Twitter about how adding a few layers turned it into AGI. Etc., etc.
 
+That said, it does feel unique and fun to use. If you're the type of person who's drowning in VRAM and would rather have some more variety at the expense of needing to make a few manual edits to clean up mistakes, give it a try.
 
+## Format
+ChatML
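For reference, ChatML delimits each turn with `<|im_start|>` and `<|im_end|>` special tokens, so a prompt to the model looks like this (the system message here is just a placeholder):

```
<|im_start|>system
You are a helpful roleplay assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```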
 
+## Samplers
+Because stack merges introduce some unexpected noise into the model, I recommend a higher min_p than normal. I've been getting good results with min_p 0.1 -> temp 1 (I usually prefer something like min_p 0.03-0.05 -> temp 0.7-0.9; adjust according to taste). Add your favorite anti-repetition sampler as needed.
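In config terms, the suggestion above is roughly the following; key names vary by backend (these follow common llama.cpp-style naming), so check your frontend's equivalents:

```yaml
# Suggested starting point; "min_p -> temp" means min_p filters before temperature is applied
min_p: 0.1        # prune tokens below 10% of the top token's probability
temperature: 1.0
# plus your preferred anti-repetition sampler (e.g. a mild repetition penalty)
```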
 
 ### Configuration
 
 ...
 base_model: ../Hermes-3-Llama-3.1-70B
 dtype: bfloat16
 tokenizer_source: ../Hermes-3-Llama-3.1-70B
 ```
+
+This is an
+
+---
+
+All credit goes to the original finetuners; I'm just some dummy who can write mergekit configs.
+
+:*