T145 committed on
Commit dc1049f · verified · 1 Parent(s): 64c9a2b

Updated inference recommendations

Files changed (1)
  1. README.md +4 -8
README.md CHANGED
@@ -177,18 +177,14 @@ Based on the listed rankings as of 4/12/24, is the top-rank 8B model.

  # Inference Settings

- Personal recommendations are:
+ Personal recommendations are to use an [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:

  ```
- mirostat = 2
- mirostat_eta = 0.1
- mirostat_tau = 4.24
  num_ctx = 4096
- repeat_penalty = 1.4
+ repeat_penalty = 1.2
  temperature = 0.85
- seed = 42
  top_k = 0
- top_p = 0.95
+ top_p = 1
  ```

- After cross-referencing [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W) and [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1).
+ Other recommendations can be found on [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W) and [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1).
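
As a rough illustration of how the updated settings might be applied, the sketch below packs them into a request body shaped like the one Ollama's `/api/generate` endpoint accepts. The README does not name a runtime, so the choice of Ollama is an assumption based only on the parameter names (`num_ctx`, `repeat_penalty`). The model tag `my-8b-model:i1-q4_k_m` and the helper `build_generate_payload` are hypothetical names introduced here for illustration.

```python
import json

# Values copied from the "+" side of the diff above.
RECOMMENDED_OPTIONS = {
    "num_ctx": 4096,
    "repeat_penalty": 1.2,
    "temperature": 0.85,
    "top_k": 0,
    "top_p": 1,
}


def build_generate_payload(model, prompt, overrides=None):
    """Return a JSON-serializable request body using the recommended options.

    `overrides` (optional dict) replaces individual options without
    mutating RECOMMENDED_OPTIONS.
    """
    options = dict(RECOMMENDED_OPTIONS)
    if overrides:
        options.update(overrides)
    return {
        "model": model,      # hypothetical tag; substitute your local model name
        "prompt": prompt,
        "stream": False,     # ask for a single response object
        "options": options,
    }


if __name__ == "__main__":
    payload = build_generate_payload("my-8b-model:i1-q4_k_m", "Hello!")
    print(json.dumps(payload, indent=2))
```

The mirostat, `seed`, and `top_p = 0.95` entries removed by this commit are deliberately left out, so the runtime's defaults apply for anything not listed.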