zerofata committed
Commit 7333d57 · verified
1 Parent(s): b482ab4

Update README.md

Files changed (1)
  1. README.md +10 -6
README.md CHANGED
@@ -150,6 +150,7 @@ a:hover {text-decoration: underline;}
150
  <p>This is a creative model intended to excel at character driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing.</p>
151
  </div>
152
  </div>
 
153
  <div class="section-container">
154
  <div class="section-header">
155
  <div class="section-indicator"></div>
@@ -181,6 +182,7 @@ a:hover {text-decoration: underline;}
181
  </div>
182
  </div>
183
  </div>
 
184
  <div class="section-container">
185
  <div class="section-header">
186
  <div class="section-indicator"></div>
@@ -203,6 +205,7 @@ a:hover {text-decoration: underline;}
203
  </div>
204
  </div>
205
  </div>
 
206
  <div class="section-container">
207
  <div class="section-header">
208
  <div class="section-indicator"></div>
@@ -211,8 +214,10 @@ a:hover {text-decoration: underline;}
211
  <div class="section-content">
212
  <p>The model first went through SFT with a small synthetic dataset of 2.9 million tokens, approximately 750 conversations. Primarily RP data with small amounts of random instruct / assistant data and creative writing.</p>
213
  <p>The model then went through DPO training using approx 1100 chosen examples from the SFT dataset that were of exceptional quality or showed verifiable instruction following. Rejected samples were generated using another Llama 3.3 finetune that is known for poor instruction following.</p>
 
 
214
  <h3 class="subheading">SFT 1*H200</h3>
215
- ```yml
216
  # ====================
217
  # MODEL CONFIGURATION
218
  # ====================
@@ -322,10 +327,9 @@ save_safetensors: true
322
  wandb_project: project_name
323
  # wandb_entity: your_entity # Uncomment and set if needed
324
  # wandb_name: your_run_name # Uncomment and set if needed
325
- ```
326
-
327
- <h3 class="subheading">DPO 2*H200</h3>
328
- ```yml
329
  # ====================
330
  # MODEL CONFIGURATION
331
  # ====================
@@ -421,7 +425,7 @@ save_safetensors: true
421
  wandb_project: project_name
422
  # wandb_entity: your_entity # Uncomment and set if needed
423
  # wandb_name: your_run_name # Uncomment and set if needed
424
- ```
425
  </div>
426
  </div>
427
  </div>
 
150
  <p>This is a creative model intended to excel at character driven RP / ERP. It has not been tested or trained on adventure stories or any large amounts of creative writing.</p>
151
  </div>
152
  </div>
153
+
154
  <div class="section-container">
155
  <div class="section-header">
156
  <div class="section-indicator"></div>
 
182
  </div>
183
  </div>
184
  </div>
185
+
186
  <div class="section-container">
187
  <div class="section-header">
188
  <div class="section-indicator"></div>
 
205
  </div>
206
  </div>
207
  </div>
208
+
209
  <div class="section-container">
210
  <div class="section-header">
211
  <div class="section-indicator"></div>
 
214
  <div class="section-content">
215
  <p>The model first went through SFT with a small synthetic dataset of 2.9 million tokens, approximately 750 conversations. Primarily RP data with small amounts of random instruct / assistant data and creative writing.</p>
216
  <p>The model then went through DPO training using approx 1100 chosen examples from the SFT dataset that were of exceptional quality or showed verifiable instruction following. Rejected samples were generated using another Llama 3.3 finetune that is known for poor instruction following.</p>
217
+ <h3 class="subheading">Axolotl configs</h3>
218
+ <p>Neither are optimized for cost / performance efficiency, YMMV.</p>
219
  <h3 class="subheading">SFT 1*H200</h3>
220
+ <pre><code>
221
  # ====================
222
  # MODEL CONFIGURATION
223
  # ====================
 
327
  wandb_project: project_name
328
  # wandb_entity: your_entity # Uncomment and set if needed
329
  # wandb_name: your_run_name # Uncomment and set if needed
330
+ </code></pre>
331
+ <h3 class="subheading">DPO 2*H200</h3>
332
+ <pre><code>
 
333
  # ====================
334
  # MODEL CONFIGURATION
335
  # ====================
 
425
  wandb_project: project_name
426
  # wandb_entity: your_entity # Uncomment and set if needed
427
  # wandb_name: your_run_name # Uncomment and set if needed
428
+ </code></pre>
429
  </div>
430
  </div>
431
  </div>
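
Note on the DPO data described in the README text above: the chosen responses come from the SFT dataset and the rejected responses are generated by a second model. A minimal sketch of how such preference pairs could be assembled follows. It is not the author's pipeline. The function name `build_dpo_pairs`, the record keys (`prompt`, `response`, `chosen`, `rejected`), and the two callables are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List


def build_dpo_pairs(
    sft_examples: List[Dict[str, str]],
    generate_rejected: Callable[[str], str],
    is_high_quality: Callable[[Dict[str, str]], bool],
) -> List[Dict[str, str]]:
    """Assemble preference records from SFT examples.

    sft_examples: records with hypothetical keys "prompt" and "response".
    generate_rejected: stand-in for sampling from a weaker model.
    is_high_quality: stand-in for whatever filter selects the chosen set.
    """
    pairs = []
    for example in sft_examples:
        # Keep only examples that pass the quality filter.
        if not is_high_quality(example):
            continue
        pairs.append(
            {
                "prompt": example["prompt"],
                # The curated SFT response is the preferred answer.
                "chosen": example["response"],
                # The weaker model's output for the same prompt is the dispreferred answer.
                "rejected": generate_rejected(example["prompt"]),
            }
        )
    return pairs
```

In practice `generate_rejected` would wrap an inference call to the second model, and the resulting records would be written out in whatever format the training framework expects.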