JacksonBrune/f4ae039f-bc7d-4b14-8ef7-c8a3753e3d98

PEFT · Safetensors · llama · axolotl · Generated from Trainer

JacksonBrune committed on Feb 25

Commit 92c6589 · verified · 1 parent: d1965ca

End of training

Files changed (1)

README.md +0 -143

@@ -13,101 +13,6 @@ model-index:
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
-<details><summary>See axolotl config</summary>
-axolotl version: `0.4.1`
-```yaml
-adapter: lora
-base_model: unsloth/SmolLM-135M
-bf16: auto
-chat_template: llama3
-dataset_prepared_path: null
-datasets:
-- data_files:
-  - 5b80cb3e3f6cb6d6_train_data.json
-  ds_type: json
-  format: custom
-  path: /workspace/input_data/5b80cb3e3f6cb6d6_train_data.json
-  type:
-    field_input: outline
-    field_instruction: topic
-    field_output: markdown
-    format: '{instruction} {input}'
-    no_input_format: '{instruction}'
-    system_format: '{system}'
-    system_prompt: ''
-debug: null
-deepspeed: null
-device_map: auto
-do_eval: true
-early_stopping_patience: 3
-eval_batch_size: 4
-eval_max_new_tokens: 128
-eval_steps: 150
-eval_table_size: null
-evals_per_epoch: null
-flash_attention: true
-fp16: false
-fsdp: null
-fsdp_config: null
-gradient_accumulation_steps: 4
-gradient_checkpointing: false
-group_by_length: true
-hub_model_id: JacksonBrune/f4ae039f-bc7d-4b14-8ef7-c8a3753e3d98
-hub_strategy: end
-learning_rate: 0.0002
-load_in_4bit: false
-load_in_8bit: false
-local_rank: null
-logging_steps: 50
-lora_alpha: 64
-lora_dropout: 0.05
-lora_fan_in_fan_out: null
-lora_model_dir: null
-lora_r: 32
-lora_target_linear: true
-lr_scheduler: constant
-max_grad_norm: 1.0
-max_memory:
-  0: 75GB
-max_steps: 14997
-micro_batch_size: 4
-mlflow_experiment_name: /tmp/5b80cb3e3f6cb6d6_train_data.json
-model_type: AutoModelForCausalLM
-num_epochs: 10
-optim_args:
-  adam_beta1: 0.9
-  adam_beta2: 0.95
-  adam_epsilon: 1e-5
-optimizer: adamw_bnb_8bit
-output_dir: miner_id_24
-pad_to_sequence_len: true
-resume_from_checkpoint: null
-s2_attention: null
-sample_packing: false
-save_steps: 150
-saves_per_epoch: null
-sequence_len: 512
-strict: false
-tf32: true
-tokenizer_type: AutoTokenizer
-train_on_inputs: false
-trust_remote_code: true
-val_set_size: 0.05
-wandb_entity: null
-wandb_mode: online
-wandb_name: f8675f5c-3a6e-463d-b56c-c54c0872a08f
-wandb_project: SN56-18
-wandb_run: your_name
-wandb_runid: f8675f5c-3a6e-463d-b56c-c54c0872a08f
-warmup_steps: 50
-weight_decay: 0.0
-xformers_attention: null
-```
-</details><br>
 # f4ae039f-bc7d-4b14-8ef7-c8a3753e3d98
@@ -127,54 +32,6 @@ More information needed
 More information needed
-## Training procedure
-### Training hyperparameters
-The following hyperparameters were used during training:
-- learning_rate: 0.0002
-- train_batch_size: 4
-- eval_batch_size: 4
-- seed: 42
-- gradient_accumulation_steps: 4
-- total_train_batch_size: 16
-- optimizer: Use OptimizerNames.ADAMW_BNB with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=adam_beta1=0.9,adam_beta2=0.95,adam_epsilon=1e-5
-- lr_scheduler_type: constant
-- lr_scheduler_warmup_steps: 50
-- training_steps: 6739
-### Training results
-| Training Loss | Epoch  | Step | Validation Loss |
-|:-------------:|:------:|:----:|:---------------:|
-| No log        | 0.0015 | 1    | 1.0966          |
-| 0.9341        | 0.2226 | 150  | 0.9216          |
-| 0.9016        | 0.4451 | 300  | 0.8889          |
-| 0.8677        | 0.6677 | 450  | 0.8650          |
-| 0.8513        | 0.8902 | 600  | 0.8484          |
-| 0.8115        | 1.1128 | 750  | 0.8367          |
-| 0.7904        | 1.3353 | 900  | 0.8238          |
-| 0.7913        | 1.5579 | 1050 | 0.8135          |
-| 0.7916        | 1.7804 | 1200 | 0.8034          |
-| 0.7432        | 2.0030 | 1350 | 0.7949          |
-| 0.725         | 2.2255 | 1500 | 0.7911          |
-| 0.7347        | 2.4481 | 1650 | 0.7859          |
-| 0.7437        | 2.6706 | 1800 | 0.7784          |
-| 0.7178        | 2.8932 | 1950 | 0.7716          |
-| 0.677         | 3.1157 | 2100 | 0.7752          |
-| 0.692         | 3.3383 | 2250 | 0.7689          |
-| 0.6876        | 3.5608 | 2400 | 0.7643          |
-| 0.6831        | 3.7834 | 2550 | 0.7587          |
-| 0.6641        | 4.0059 | 2700 | 0.7563          |
-| 0.6437        | 4.2285 | 2850 | 0.7588          |
-| 0.639         | 4.4510 | 3000 | 0.7568          |
-| 0.6522        | 4.6736 | 3150 | 0.7516          |
-| 0.6516        | 4.8961 | 3300 | 0.7486          |
-| 0.605         | 5.1187 | 3450 | 0.7572          |
-| 0.6003        | 5.3412 | 3600 | 0.7551          |
-| 0.6093        | 5.5638 | 3750 | 0.7518          |
 ### Framework versions
 - PEFT 0.13.2
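
The removed results table ends at step 3750, well short of the reported `training_steps: 6739`. The commit does not say why the run stopped, but the pattern would be consistent with `early_stopping_patience: 3` at `eval_steps: 150`. The following is a minimal sketch of a patience rule applied to the validation losses copied from the table; `first_stop_step` is a hypothetical helper written for illustration, not axolotl's or the Trainer's actual callback.

```python
# (step, validation loss) pairs copied from the removed "Training results" table.
EVAL_LOG = [
    (1, 1.0966), (150, 0.9216), (300, 0.8889), (450, 0.8650), (600, 0.8484),
    (750, 0.8367), (900, 0.8238), (1050, 0.8135), (1200, 0.8034),
    (1350, 0.7949), (1500, 0.7911), (1650, 0.7859), (1800, 0.7784),
    (1950, 0.7716), (2100, 0.7752), (2250, 0.7689), (2400, 0.7643),
    (2550, 0.7587), (2700, 0.7563), (2850, 0.7588), (3000, 0.7568),
    (3150, 0.7516), (3300, 0.7486), (3450, 0.7572), (3600, 0.7551),
    (3750, 0.7518),
]


def first_stop_step(eval_log, patience):
    """Apply a simple patience rule (hypothetical helper, for illustration).

    Training "stops" at the first evaluation where the loss has failed to
    beat the best value seen so far for `patience` evaluations in a row.
    Returns (stop_step, best_step, best_loss); stop_step is None if the
    rule never triggers.
    """
    best_loss = float("inf")
    best_step = None
    bad_evals = 0
    for step, loss in eval_log:
        if loss < best_loss:
            # New best: remember it and reset the patience counter.
            best_loss, best_step, bad_evals = loss, step, 0
        else:
            bad_evals += 1
            if bad_evals >= patience:
                return step, best_step, best_loss
    return None, best_step, best_loss


stop_step, best_step, best_loss = first_stop_step(EVAL_LOG, patience=3)
print(stop_step, best_step, best_loss)
```

Under this rule the best validation loss is 0.7486 at step 3300, and the third consecutive evaluation without improvement falls at step 3750, the last row of the table.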

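Several of the removed hyperparameters are derived from one another, and a short calculation can make the relationships explicit. This is a rough sketch using only values that appear in the removed lines; the single-device assumption (`num_devices = 1`) is not stated in the commit, and the steps-per-epoch figure is an estimate from the rounded epoch column, so it only approximately matches the reported `training_steps: 6739`.

```python
# Values copied from the removed axolotl config and hyperparameter list.
micro_batch_size = 4
gradient_accumulation_steps = 4
num_devices = 1  # assumption: the commit does not state the device count

# Effective batch size per optimizer step (reported as total_train_batch_size: 16).
total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices

# Estimate optimizer steps per epoch from the last row of the results table
# (step 3750 at epoch 5.5638). The epoch column is rounded to 4 decimals.
last_step, last_epoch = 3750, 5.5638
steps_per_epoch = last_step / last_epoch

# num_epochs: 10 in the config; compare against the reported training_steps.
num_epochs = 10
estimated_total_steps = round(steps_per_epoch * num_epochs)

# Standard LoRA scaling factor alpha / r, from lora_alpha: 64 and lora_r: 32.
lora_alpha, lora_r = 64, 32
lora_scaling = lora_alpha / lora_r

print(total_train_batch_size, round(steps_per_epoch), estimated_total_steps, lora_scaling)
```

The estimate lands within one step of the reported `training_steps`, which suggests the run was bounded by `num_epochs: 10` rather than by `max_steps: 14997`, although the commit does not confirm how the final value was computed.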