Update README.md
README.md
CHANGED
@@ -129,19 +129,10 @@ library_name: transformers
 </div>
 <p>
   Dans-PersonalityEngine is a versatile model series
-
-
-
-
-  Fine-tuned on a diverse corpus of 50+ specialized
-  datasets, this series excels at both creative
-  endeavors (like roleplay and co-writing) and
-  technical tasks (such as code generation, tool use,
-  and complex reasoning).
- </p>
- <p>
-  The total dataset size is around 1.7B tokens, 1.1B
-  of which is creative and 600M being technical.
+  fine-tuned on 50+ specialized datasets, designed to
+  excel at both creative tasks (like roleplay and
+  co-writing) and technical challenges (such as code
+  generation, tool use, and complex reasoning).
 </p>
 <p>
   V1.3.0 introduces multilingual capabilities with
@@ -152,22 +143,22 @@ library_name: transformers
 </p>
 <h3>Multilingual Support</h3>
 <pre class="code-block">
-Arabic
-Hindi
+Arabic     Chinese    English
+French     German     Hindi
+Japanese   Korean     Portuguese
+Spanish</pre
 >
 <h3>Key Details</h3>
 <pre class="code-block">
 BASE MODEL: mistralai/Mistral-Small-3.1-24B-Base-2503
 LICENSE: apache-2.0
 LANGUAGE: Multilingual with 10 supported languages
-CONTEXT LENGTH: 32768 tokens, 131072 with degraded
+CONTEXT LENGTH: 32768 tokens, 131072 with degraded recall</pre
 >
 <h3>Recommended Settings</h3>
 <pre class="code-block">
 TEMPERATURE: 1.0
-TOP_P: 0.
-MIN_P: 0.05
-REPETITION_PENALTY: 1.04</pre
+TOP_P: 0.9</pre
 >
 <h3>Prompting Format</h3>
 <p>
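The Recommended Settings block maps one-to-one onto transformers generation arguments. Below is a minimal sketch of loading the model and sampling with those values; the repo id is a hypothetical placeholder (substitute the actual V1.3.0 repository), and sampling must be explicitly enabled for the settings to apply.

```python
# Minimal sketch: apply the README's recommended sampling settings with
# transformers. The repo id is a placeholder assumption, not the real one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "PocketDoc/Dans-PersonalityEngine-V1.3.0-24b"  # hypothetical repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # 24B parameters; bf16 keeps memory manageable
    device_map="auto",
)

inputs = tokenizer(
    "Write a short scene set on a night train.",
    return_tensors="pt",
).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,    # required for temperature/top_p to take effect
    temperature=1.0,   # TEMPERATURE from the settings block
    top_p=0.9,         # TOP_P from the settings block
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```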
@@ -180,11 +171,9 @@ REPETITION_PENALTY: 1.04</pre
 <h3>Why not ChatML?</h3>
 <p>
   While ChatML is a standard format for LLMs, it has
-  limitations. DanChat-2 uses
+  limitations. DanChat-2 uses special tokens
   for each role, reducing biases and helping the model
-  adapt to different tasks more readily.
-  effects could be achieved with modified ChatML, such
-  approaches come with their own caveats.
+  adapt to different tasks more readily.
 </p>
 <h3>SillyTavern Template</h3>
 <p>