Update README.md
README.md CHANGED
@@ -133,11 +133,11 @@ library_name: transformers
   range of tasks and domains.
 </p>
 <p>
-
-
-
-
-
+  Fine-tuned on a diverse corpus of 50+ specialized
+  datasets, this series excels at both creative
+  endeavors (like roleplay and co-writing) and
+  technical tasks (such as code generation, tool use,
+  and complex reasoning).
 </p>
 <p>
   The total dataset size is around 1.7B tokens, 1.1B
@@ -146,7 +146,9 @@ library_name: transformers
 <p>
   V1.3.0 introduces multilingual capabilities with
   support for 10 languages and enhanced domain
-  expertise across multiple fields.
+  expertise across multiple fields. The primary
+  language is still English, and that is where peak
+  performance can be expected.
 </p>
 <h3>Multilingual Support</h3>
 <pre class="code-block">
@@ -158,19 +160,35 @@ Hindi Japanese Korean Portuguese Spanish</pre
 BASE MODEL: mistralai/Mistral-Small-3.1-24B-Base-2503
 LICENSE: apache-2.0
 LANGUAGE: Multilingual with 10 supported languages
-CONTEXT LENGTH: 32768 tokens, 131072 with degraded
+CONTEXT LENGTH: 32768 tokens, 131072 with degraded quality</pre
 >
 <h3>Recommended Settings</h3>
 <pre class="code-block">
 TEMPERATURE: 1.0
 TOP_P: 0.95
-MIN_P: 0.05
+MIN_P: 0.05
+REPETITION_PENALTY: 1.04</pre
 >
 <h3>Prompting Format</h3>
 <p>
   The model uses the following format, which I'll refer to as
   "DanChat-2":
 </p>
+<h4>Why not ChatML?</h4>
+<p>
+  ChatML is a widely used and standardized format for
+  LLMs, but it has some limitations: using standard
+  tokens as turn-ownership indicators can impart
+  biases to the model. DanChat-2 uses unique special
+  tokens for each role, which helps reduce these
+  biases and allows the model to more readily adapt
+  to different roles and tasks. It is possible to
+  achieve a similar effect with ChatML, but the
+  technique to do so would be nonstandard to the
+  ChatML format, and for users who do not use the
+  standard "assistant" and "user" roles it would
+  fall apart entirely.
+</p>
 <pre class="code-block">
 <|system|>system prompt<|endoftext|><|user|>Hi there!<|endoftext|><|assistant|>Hey, how can I help?<|endoftext|></pre
 >
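
To make the "Recommended Settings" block concrete, here is a minimal sketch of applying those sampler values at generation time with Hugging Face transformers. The repo ID is a placeholder (the card's repo name is not shown in this diff), and passing `min_p` to `generate` assumes a recent transformers release; treat this as an illustration, not the card's official usage snippet.

```python
# Minimal sketch: applying the card's recommended sampler settings with
# Hugging Face transformers. The repo ID is a placeholder, not from the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "<this-model-repo>"  # placeholder for the model this card describes

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# A single-turn DanChat-2 prompt, ending with an open assistant tag.
prompt = (
    "<|system|>You are a helpful assistant.<|endoftext|>"
    "<|user|>Hi there!<|endoftext|>"
    "<|assistant|>"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(
    **inputs,
    do_sample=True,
    temperature=1.0,          # TEMPERATURE
    top_p=0.95,               # TOP_P
    min_p=0.05,               # MIN_P (needs a recent transformers version)
    repetition_penalty=1.04,  # REPETITION_PENALTY
    max_new_tokens=256,
)
# Decode only the newly generated tokens.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```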
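And to show the DanChat-2 template as code: a small hypothetical helper (not from the card) that assembles a multi-turn conversation from the per-role special tokens, leaving a trailing <|assistant|> tag open so the model completes that turn. Only the tags and the <|endoftext|> terminator come from the format shown above; everything else is an assumption.

```python
# Hypothetical helper illustrating the DanChat-2 layout shown above.
ENDOFTEXT = "<|endoftext|>"

def danchat2_prompt(messages: list[dict]) -> str:
    """Build a DanChat-2 prompt from {"role": ..., "content": ...} turns."""
    parts = [f"<|{m['role']}|>{m['content']}{ENDOFTEXT}" for m in messages]
    parts.append("<|assistant|>")  # open turn for the model to complete
    return "".join(parts)

print(danchat2_prompt([
    {"role": "system", "content": "system prompt"},
    {"role": "user", "content": "Hi there!"},
]))
# <|system|>system prompt<|endoftext|><|user|>Hi there!<|endoftext|><|assistant|>
```

In practice the repo's tokenizer likely ships a chat template, in which case `tokenizer.apply_chat_template(messages, add_generation_prompt=True)` is the safer path; the helper above just makes the token layout explicit.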