Updated inference recommendations
README.md
CHANGED
@@ -177,18 +177,14 @@ Based on the listed rankings as of 4/12/24, is the top-rank 8B model.
 
 # Inference Settings
 
-Personal recommendations are:
+Personal recommendations are to use an [i1-Q4_K_M](https://www.reddit.com/r/LocalLLaMA/comments/1ck76rk/weightedimatrix_vs_static_quants/) quant with these settings:
 
 ```
-mirostat = 2
-mirostat_eta = 0.1
-mirostat_tau = 4.24
 num_ctx = 4096
-repeat_penalty = 1.
+repeat_penalty = 1.2
 temperature = 0.85
-seed = 42
 top_k = 0
-top_p =
+top_p = 1
 ```
 
-
+Other recommendations can be found on [this paper on mobile LLMs](https://openreview.net/pdf?id=ahVsd1hy2W) and [this paper on balancing model parameters](https://arxiv.org/html/2408.13586v1).
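
As a rough illustration of how the updated settings could be consumed, the sketch below parses the new `key = value` block into a typed dictionary. The README does not name a runtime. The parameter names resemble those accepted by Ollama Modelfiles, so the second helper renders them as `PARAMETER <name> <value>` lines on that assumption. Both function names (`parse_settings`, `to_modelfile_parameters`) are hypothetical and not part of the README.

```python
from typing import Dict, Union

Number = Union[int, float]

# Settings block copied from the updated README (the "+" side of the diff).
SETTINGS_TEXT = """
num_ctx = 4096
repeat_penalty = 1.2
temperature = 0.85
top_k = 0
top_p = 1
"""


def parse_settings(text: str) -> Dict[str, Number]:
    """Parse `key = value` lines into a dict, keeping integers as int."""
    settings: Dict[str, Number] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        key, _, raw = line.partition("=")
        raw = raw.strip()
        # Values with a decimal point become float; everything else int.
        settings[key.strip()] = float(raw) if "." in raw else int(raw)
    return settings


def to_modelfile_parameters(settings: Dict[str, Number]) -> str:
    """Render settings as PARAMETER lines (assumes an Ollama-style Modelfile)."""
    return "\n".join(f"PARAMETER {name} {value}" for name, value in settings.items())


if __name__ == "__main__":
    print(to_modelfile_parameters(parse_settings(SETTINGS_TEXT)))
```

If a different runtime is used, only the rendering helper would need to change; the parsed dictionary is runtime-neutral.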