zerofata committed · Commit af1ea8a · verified · 1 parent: 224b88b

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -386,7 +386,7 @@ base_model:
  <div class="section-content">
  <p>Creation process: Upscale > Pretrain > SFT > DPO</p>
  <p>All training was qlora (including pretrain).</p>
- <p>Pretrained on 177MB of data. Dataset consisteted mostly of Light Novels, NSFW stories, SFW stories and filled out with general corpos text from Huggingface FineWeb-2 dataset.</p>
+ <p>Pretrained on 177MB of data. Dataset consisteted mostly of Light Novels, NSFW stories, SFW stories and filled out with general corpus text from Huggingface FineWeb-2 dataset.</p>
  <p>The model then went through SFT using a dataset of approx 3.6 million tokens, 700 RP conversations, 1000 creative writing / instruct samples and about 100 summaries. The bulk of this data has been made public.</p>
  <p>Finally, DPO was used to make the model more consistent.</p>
  <div class="dropdown-container">