DavidAU committed on
Commit
a37a3d2
·
verified ·
1 Parent(s): a6c59aa

Update README.md

Files changed (1)
  1. README.md +13 -4
README.md CHANGED
@@ -93,10 +93,6 @@ https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated
93
 
94
  The mixture of experts is set at 2 experts, but you can use 3, 4, 5, 6, 7, or even 8.
95
 
96
- You can set the number of experts in LMStudio (https://lmstudio.ai) at the "load" screen, and in other LLM apps by setting "Experts" or "Number of Experts".
97
-
98
- When using an API, set "num_experts_used" in the JSON payload (this may be different for different back ends).
99
-
100
  This "team" has a Captain (first listed model), and then all the team members contribute to the "token"
101
  choice billions of times per second. Note the Captain also contributes too.
102
 
@@ -110,6 +106,19 @@ That means the power of every model is available during instruction and output g
110
 
111
  This brings unparalleled power to all forms of generation and all use cases.
112
 
113
  CREDITS:
114
 
115
  Please visit each repo above to see what model(s) contributed to each of the models above.
 
93
 
94
  The mixture of experts is set at 2 experts, but you can use 3, 4, 5, 6, 7, or even 8.
95
 
96
  This "team" has a Captain (first listed model), and then all the team members contribute to the "token"
97
  choice billions of times per second. Note the Captain also contributes too.
98
 
 
106
 
107
  This brings unparalleled power to all forms of generation and all use cases.
108
 
109
+ CHANGING THE NUMBER OF EXPERTS:
110
+
111
+ You can set the number of experts in LMStudio (https://lmstudio.ai) at the "load" screen, and in other LLM apps by setting "Experts" or "Number of Experts".
112
+
113
+ For Text-Generation-Webui (https://github.com/oobabooga/text-generation-webui), set the number of experts on the loading screen.
114
+
115
+ For server.exe / llama-server.exe (llama.cpp - https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md),
116
+ add the following (CLI): "--override-kv llama.expert_used_count=int:6"
117
+
118
+ (no quotes, where "6" is the number of experts to use)
119
+
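The llama.cpp override above can also be assembled in a script before launching the server. This is a minimal Python sketch, not an official launcher: `build_server_args` is a hypothetical helper, the binary name `llama-server` and the model path `model.gguf` are placeholders, and the 2 to 8 range check simply mirrors the values mentioned in this README.

```python
import shlex


def build_server_args(model_path: str, num_experts: int) -> list[str]:
    """Build a llama-server command line that overrides the expert count.

    Hypothetical helper: the binary name and model path are assumptions.
    """
    # Assumption: the README mentions 2 through 8 experts, so reject anything else.
    if not 2 <= num_experts <= 8:
        raise ValueError("expected between 2 and 8 experts")
    return [
        "llama-server",
        "-m", model_path,
        # Same key/value string as the CLI example in the README.
        "--override-kv", f"llama.expert_used_count=int:{num_experts}",
    ]


# Print the command as a single shell-ready string.
print(shlex.join(build_server_args("model.gguf", 6)))
```

The resulting list can be passed to `subprocess.run` as-is, so no shell quoting is needed.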
120
+ When using an API, set "num_experts_used" in the JSON payload (this may be different for different back ends).
121
+
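As a rough illustration of the API note above, the sketch below builds a JSON request body in Python. Only the `num_experts_used` key comes from this README; the `prompt` field and the `build_payload` helper are assumptions for illustration, and your back end may expect different field names.

```python
import json


def build_payload(prompt: str, num_experts: int) -> str:
    """Return a JSON request body asking the back end to use `num_experts` experts.

    Hypothetical helper: "prompt" is an assumed field name; only
    "num_experts_used" is taken from the README.
    """
    payload = {
        "prompt": prompt,
        "num_experts_used": num_experts,
    }
    return json.dumps(payload)


# Show the body that would be sent to the back end.
print(build_payload("Hello", 4))
```

Check your back end's documentation for the exact key before relying on this.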
122
  CREDITS:
123
 
124
  Please visit each repo above to see what model(s) contributed to each of the models above.