Update README.md
README.md CHANGED
@@ -21,7 +21,7 @@ That said, it does feel unique and fun to use. If you're the type of person who'
 ChatML

 ## Samplers

-Because stack merges introduce some unexpected noise to the model, I recommend higher min p than normal. I've been getting good results with min_p 0.
+Because stack merges introduce some unexpected noise to the model, I recommend a higher min_p than normal. I've been getting good results with min_p 0.09-0.11 -> temp 0.8-1.0; add your favorite anti-repetition sampler as needed.

 ### Configuration

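As a concrete reading of the recommendation above, here is a minimal sampler preset sketch. The key names follow llama.cpp conventions and are illustrative only; frontends spell these settings differently, and `repeat_penalty` stands in for whatever anti-repetition sampler you prefer.

```yaml
# Illustrative sampler preset (llama.cpp-style key names, not from this repo)
min_p: 0.10          # recommended range: 0.09-0.11
temp: 0.9            # recommended range: 0.8-1.0
repeat_penalty: 1.1  # optional anti-repetition sampler, tune to taste
```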
@@ -70,7 +70,7 @@ dtype: bfloat16
 tokenizer_source: ../Hermes-3-Llama-3.1-70B
 ```

-
+In the first few iterations I tried merging the tokenizers in an attempt to support both ChatML and L3, but it ended up breaking both of them. I also tried lower and higher slerp ratios, but this seems like the sweet spot.

 ---

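The tokenizer note above maps onto mergekit's `tokenizer_source` option. A minimal sketch of the two approaches, assuming current mergekit semantics where `union` combines the source vocabularies and a model path copies that model's tokenizer unchanged:

```yaml
# Attempted in early iterations: combine vocabularies to support both
# ChatML and L3 prompt formats (this broke both for this merge).
# tokenizer_source: union

# Kept: inherit the Hermes (ChatML) tokenizer as-is.
tokenizer_source: ../Hermes-3-Llama-3.1-70B
```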