Text Generation
GGUF
English
mixture of experts
Mixture of Experts
8x3B
Llama 3.2 MOE
128k context
creative
creative writing
fiction writing
plot generation
sub-plot generation
story generation
scene continue
storytelling
fiction story
science fiction
romance
all genres
story
writing
vivid prosing
vivid writing
fiction
roleplaying
bfloat16
swearing
rp
horror
mergekit
conversational
Update README.md
README.md
CHANGED
@@ -93,10 +93,6 @@ https://huggingface.co/huihui-ai/Llama-3.2-3B-Instruct-abliterated
 
 The mixture of experts is set at 2 experts, but you can use 3,4,5,6.. 7 and even 8.
 
-You can set the number of experts in LMStudio (https://lmstudio.ai) at the "load" screen and via other apps/llm apps by setting "Experts" or "Number of Experts".
-
-When using "API", you set the "num_experts_used" in the JSON payload (this maybe different for different back ends).
-
 This "team" has a Captain (first listed model), and then all the team members contribute to the to "token"
 choice billions of times per second. Note the Captain also contributes too.
 
@@ -110,6 +106,19 @@ That means the power of every model is available during instruction and output g
 
 This brings unparalleled power to all forms of generation and all use cases.
 
+CHANGING THE NUMBER OF EXPERTS:
+
+You can set the number of experts in LMStudio (https://lmstudio.ai) at the "load" screen and via other apps/llm apps by setting "Experts" or "Number of Experts".
+
+For Text-Generation-Webui (https://github.com/oobabooga/text-generation-webui) you set the number of experts at the loading screen page.
+
+For server.exe / Llama-server.exe (Llamacpp - https://github.com/ggerganov/llama.cpp/blob/master/examples/server/README.md )
+add the following (CLI): "--override-kv llama.expert_used_count=int:6"
+
+(no quotes, where "6" is the number of experts to use)
+
+When using "API", you set the "num_experts_used" in the JSON payload (this maybe different for different back ends).
+
 CREDITS:
 
 Please visit each repo above to see what model(s) contributed to each of models above.
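The CLI override and the API field described in the added README lines can be sketched as a small Python helper. This is a minimal illustration, not part of the commit: the executable name `llama-server`, the model path, and the `prompt` key are assumptions, and the README itself notes that the payload field may differ between back ends. Only `--override-kv llama.expert_used_count=int:N` and `num_experts_used` come from the text above.

```python
import json

# The model is described as 8x3B, so 1..8 experts is assumed to be the valid range.
MAX_EXPERTS = 8


def check_experts(n_experts):
    """Return n_experts unchanged if it is an integer in 1..MAX_EXPERTS."""
    if isinstance(n_experts, bool) or not isinstance(n_experts, int):
        raise ValueError("number of experts must be an integer")
    if not 1 <= n_experts <= MAX_EXPERTS:
        raise ValueError(f"number of experts must be between 1 and {MAX_EXPERTS}")
    return n_experts


def server_args(model_path, n_experts):
    """Build an argument list for a llama.cpp server launch.

    'llama-server' and model_path are placeholders (assumptions);
    the --override-kv value follows the README text.
    """
    n = check_experts(n_experts)
    return [
        "llama-server",
        "-m", model_path,
        "--override-kv", f"llama.expert_used_count=int:{n}",
    ]


def api_payload(prompt, n_experts):
    """Build a JSON request body that sets 'num_experts_used'.

    The 'prompt' key is an assumption; check your back end's API,
    since the README says the field name may differ.
    """
    return json.dumps({
        "prompt": prompt,
        "num_experts_used": check_experts(n_experts),
    })


if __name__ == "__main__":
    # Hypothetical model file name, for illustration only.
    print(" ".join(server_args("model.gguf", 6)))
    print(api_payload("Continue the scene.", 4))
```

The argument list can be passed to `subprocess.run` as-is, which avoids the quoting issue the README mentions ("no quotes") because each argument is a separate list element.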