mergesloppa123123 committed
Commit 47c1895 · verified · 1 Parent(s): 0bbba9d

Update README.md

Files changed (1):
  1. README.md +2 -2
README.md CHANGED
@@ -21,7 +21,7 @@ That said, it does feel unique and fun to use. If you're the type of person who'
ChatML

## Samplers
- Because stack merges introduce some unexpected noise to the model, I recommend higher min p than normal. I've been getting good results with min_p 0.1 -> temp 1 (I usually prefer something like min_p 0.03-0.05 -> temp 0.7-0.9, adjust according to taste). Add your favorite anti-repetition sampler as needed.
+ Because stack merges introduce some unexpected noise to the model, I recommend a higher min_p than normal. I've been getting good results with min_p 0.09-0.11 -> temp 0.8-1.0; add your favorite anti-repetition sampler as needed.

### Configuration

@@ -70,7 +70,7 @@ dtype: bfloat16
tokenizer_source: ../Hermes-3-Llama-3.1-70B
```

- This is an
+ In the first few iterations I tried merging the tokenizers in an attempt to support both ChatML and L3, but it ended up breaking both of them. I also tried lower and higher slerp ratios, but this seems like the sweet spot.

---
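For reference, the sampler recommendation in the updated line maps onto a flat preset like the sketch below. This is an assumption-laden illustration, not part of the commit: the key names follow text-generation-webui's preset format (which the README doesn't specify), the exact values are picked from within the stated ranges, and repetition_penalty stands in for the unnamed "anti-repetition sampler".

```yaml
# Sketch of a sampler preset matching the recommended ranges (values assumed).
temperature: 0.9          # from the recommended 0.8-1.0 range
min_p: 0.10               # from the recommended 0.09-0.11 range
repetition_penalty: 1.1   # assumed anti-repetition setting; tune to taste
```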
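The `tokenizer_source` line visible above is only the tail of the merge config; the full config isn't shown in this diff. As a rough sketch of what a mergekit slerp config with a tuned ratio and a single tokenizer source looks like (model names, layer ranges, and the `t` value here are placeholders, not the commit's actual settings):

```yaml
# Generic mergekit slerp sketch; only dtype and tokenizer_source appear in the diff.
slices:
  - sources:
      - model: ../Hermes-3-Llama-3.1-70B
        layer_range: [0, 80]           # Llama-3.1-70B has 80 layers
      - model: ../second-parent-model  # placeholder; the real second parent isn't shown here
        layer_range: [0, 80]
merge_method: slerp
base_model: ../Hermes-3-Llama-3.1-70B
parameters:
  t: 0.5                               # the "slerp ratio" being tuned; value assumed
dtype: bfloat16
tokenizer_source: ../Hermes-3-Llama-3.1-70B
```

Keeping `tokenizer_source` pointed at a single parent is consistent with the commit's note that merging the tokenizers broke both prompt formats.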