Update README.md
README.md (CHANGED)
@@ -141,7 +141,7 @@ library_name: transformers
 </p>
 <p>
   The total dataset size is around 1.7B tokens, 1.1B
-  of which
+  of which is creative and 600M technical.
 </p>
 <p>
   V1.3.0 introduces multilingual capabilities with
@@ -174,24 +174,20 @@ REPETITION_PENALTY: 1.04</pre
   The model uses the following format I'll refer to as
   "DanChat-2":
 </p>
-<h4>Why not ChatML?</h4>
-<p>
-  ChatML is a widely used and standardized format for
-  LLMs but it has some limitations, using standard
-  tokens as turn ownership indicators can impart
-  biases to the model. DanChat-2 uses unique special
-  tokens for each role, which helps to reduce these
-  biases and allow the model to more readily adapt to
-  different roles and tasks. Yes, It is possible to
-  achieve a similar effect with ChatML, but the
-  technique to do so would be nonstandard to the
-  ChatML format and for users who do not use standard
-  "assistant" and "user" roles, it would fall apart
-  entirely.
-</p>
 <pre class="code-block">
 <|system|>system prompt<|endoftext|><|user|>Hi there!<|endoftext|><|assistant|>Hey, how can I help?<|endoftext|></pre
 >
+<h3>Why not ChatML?</h3>
+<p>
+  While ChatML is a standard format for LLMs, it has
+  limitations. DanChat-2 uses unique special tokens
+  for each role, reducing biases and helping the model
+  adapt to different tasks better than ChatML's
+  standard tokens. Though similar effects could be
+  achieved with modified ChatML, such approaches would
+  be non-standard and wouldn't work for users with
+  custom roles beyond "assistant" and "user".
+</p>
 <h3>SillyTavern Template</h3>
 <p>
   <a
@@ -203,6 +199,20 @@ REPETITION_PENALTY: 1.04</pre
   Download Master JSON
 </a>
 </p>
+<h3>Inference Provider</h3>
+This model and others are available from ⚡Mancer AI for
+those interested in high quality inference without
+owning or renting expensive hardware.
+<p class="mancer-button-container">
+  <a
+    href="https://mancer.tech/"
+    target="_blank"
+    rel="noopener noreferrer"
+    class="mancer-button"
+  >
+    ⚡mancer
+  </a>
+</p>
 <h3>Training Process</h3>
 <p>
   The model was trained using Axolotl on 8x H100 GPUs
@@ -429,25 +439,29 @@ REPETITION_PENALTY: 1.04</pre
   color: #e49b3e;
   overflow-x: auto; /* Added to enable horizontal scrolling */
 }
-.
-
-
-  border: 2px solid #e49b3e;
-  border-radius: 10px;
-  filter: brightness(0.9) sepia(0.2);
-  transition: all 0.3s ease;
+.mancer-button-container {
+  text-align: left;
+  margin: 1em 0;
 }
-.
-
-
+.mancer-button {
+  display: inline-block;
+  background: #1a1a1a;
+  color: #e49b3e;
+  padding: 10px 20px;
   border: 2px solid #e49b3e;
   border-radius: 5px;
-
+  font-family: "VT323", monospace;
+  font-size: 18px;
+  text-decoration: none !important; /* Ensures no underline */
+  text-shadow: 0 0 2px #e49b3e;
+  box-shadow: 0 0 10px rgba(228, 155, 62, 0.3);
   transition: all 0.3s ease;
 }
-.
-
-  box-shadow: 0 0
+.mancer-button:hover {
+  background: #2a2a2a;
+  box-shadow: 0 0 15px rgba(228, 155, 62, 0.5);
+  text-shadow: 0 0 4px #e49b3e;
+  text-decoration: none !important; /* Ensures no underline on hover */
 }
 </style>
 </html>