kurogane committed on
Commit 9c32e23 · verified · 1 Parent(s): b348c0d

Update README.md

Files changed (1)
  1. README.md +32 -11
README.md CHANGED
@@ -122,20 +122,41 @@ print(json_text)
122
 
123
  ### Training Procedure
124
 
125
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
126
-
127
- #### Preprocessing [optional]
128
-
129
- [More Information Needed]
130
-
131
 
132
  #### Training Hyperparameters
133
 
134
- - **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
135
-
136
- #### Speeds, Sizes, Times [optional]
137
-
138
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
139
 
140
  [More Information Needed]
141
 
 
122
 
123
  ### Training Procedure
124
 
125
+ Threw it into Unsloth. Training took roughly 35 minutes on an A6000.
126
 
127
  #### Training Hyperparameters
128
 
129
+ ```
130
+ from trl import SFTTrainer
131
+ from transformers import TrainingArguments
132
+ from unsloth import is_bfloat16_supported
133
+
134
+ trainer = SFTTrainer(
135
+     model = model,
136
+     tokenizer = tokenizer,
137
+     train_dataset = dataset,
138
+     dataset_text_field = "text",
139
+     max_seq_length = max_seq_length,
140
+     dataset_num_proc = 2,
141
+     packing = False, # Can make training 5x faster for short sequences.
142
+     args = TrainingArguments(
143
+         per_device_train_batch_size = 8,
144
+         gradient_accumulation_steps = 8,
145
+         warmup_steps = 5,
146
+         num_train_epochs = 1, # Set this for 1 full training run.
147
+         # max_steps = 60,
148
+         learning_rate = 2e-4,
149
+         fp16 = not is_bfloat16_supported(),
150
+         bf16 = is_bfloat16_supported(),
151
+         logging_steps = 1,
152
+         optim = "adamw_8bit",
153
+         weight_decay = 0.01,
154
+         lr_scheduler_type = "linear",
155
+         seed = 3407,
156
+         output_dir = "outputs",
157
+     ),
158
+ )
159
+ ```
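Note on the hyperparameters above: the batch settings and the linear schedule interact, so a small sketch may help readers interpret them. It is not part of the committed README. It assumes a single GPU (the commit mentions one A6000), and `total_steps = 100` is a hypothetical value chosen only for illustration, since the dataset size is not stated. The helper names `effective_batch_size` and `linear_schedule_lr` are made up for this note; the schedule mirrors the usual linear warmup followed by linear decay.

```python
def effective_batch_size(per_device: int, grad_accum: int, num_devices: int = 1) -> int:
    """Number of samples contributing to each optimizer step."""
    return per_device * grad_accum * num_devices


def linear_schedule_lr(step: int, total_steps: int,
                       base_lr: float = 2e-4, warmup_steps: int = 5) -> float:
    """Learning rate at `step` under linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        # Warmup: ramp from 0 up to base_lr over the first warmup_steps steps.
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    # Values from the diff: per_device_train_batch_size = 8, gradient_accumulation_steps = 8.
    print("effective batch size:", effective_batch_size(8, 8))
    total_steps = 100  # hypothetical; the real value depends on the dataset size
    for step in (0, 1, 5, 50, 100):
        print(step, linear_schedule_lr(step, total_steps))
```

With these settings each optimizer step sees 8 × 8 = 64 samples per device, so the number of steps in one epoch is roughly the dataset size divided by 64.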
160
 
161
  [More Information Needed]
162