Text Generation
Transformers
Safetensors
cohere2
conversational
jukofyork committed on
Commit 145d5ab · verified · 1 Parent(s): 939ef66

Update README.md

Files changed (1)
  1. README.md +3 -1
README.md CHANGED
@@ -120,7 +120,7 @@ using ~200M tokens (ie: ~100M positive and ~100M negative) from:
  - [jukofyork/instruction-responses-500MB](https://huggingface.co/datasets/jukofyork/instruction-responses-500MB)
  - [jukofyork/instruction-refusals-500MB](https://huggingface.co/datasets/jukofyork/instruction-refusals-500MB)
 
- taking just under 4 days:
+ taking just under 4 days using 6x `RTX A6000` over 3 machines:
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65995c45539c808e84c38bf1/bOMISzLsmjimXDZ2k72Mu.png)
 
@@ -128,6 +128,8 @@ taking just under 4 days:
 
  ![image/png](https://cdn-uploads.huggingface.co/production/uploads/65995c45539c808e84c38bf1/7BJJK6UKRDCgqEqEIwlkA.png)
 
+ (hence the 30 batch size: `(num_gpus / pipeline_stages) * gradient_accumulation_steps = (6 / 2) * 10 = 30`)
+
  ---
 
  The control adapter was then converted to a LoRA using [control_adapter_to_lora.py](https://github.com/jukofyork/qlora-pipe-lite/blob/main/control_adapter_to_lora.py):
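
The batch-size arithmetic added in this commit can be checked with a short sketch. The function name `effective_batch_size` is hypothetical and not part of qlora-pipe-lite. Like the README formula, it has no per-device micro-batch term, so it assumes each data-parallel replica contributes one sequence per accumulation step.

```python
def effective_batch_size(num_gpus: int,
                         pipeline_stages: int,
                         gradient_accumulation_steps: int) -> int:
    """Hypothetical helper mirroring the README formula:
    (num_gpus / pipeline_stages) * gradient_accumulation_steps.
    Assumes a micro-batch of one sequence per replica per step."""
    if num_gpus % pipeline_stages != 0:
        raise ValueError("num_gpus must be a multiple of pipeline_stages")
    # Each group of `pipeline_stages` GPUs holds one full copy of the model,
    # so the number of data-parallel replicas is the quotient.
    data_parallel_replicas = num_gpus // pipeline_stages
    return data_parallel_replicas * gradient_accumulation_steps


# Values stated in the commit: 6 GPUs, 2 pipeline stages, 10 accumulation steps.
print(effective_batch_size(6, 2, 10))  # 30
```

With 6 GPUs split into 2 pipeline stages there are 3 model replicas, and 10 accumulation steps on each gives the batch size of 30 quoted above.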