afrias5 committed bafaa1c (verified) · Parent(s): 2866cf5

Model save

Files changed (1): README.md added (+157, -0)
---
library_name: peft
license: llama2
base_model: meta-llama/CodeLlama-34b-Python-hf
tags:
- axolotl
- generated_from_trainer
datasets:
- afrias5/datasetScoreFinal
model-index:
- name: meta-codellama-34b-python-Score8192V4
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
<details><summary>See axolotl config</summary>

axolotl version: `0.5.3.dev41+g5e9fa33f`
```yaml
base_model: meta-llama/CodeLlama-34b-Python-hf
model_type: LlamaForCausalLM
tokenizer_type: CodeLlamaTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

datasets:
  - path: afrias5/datasetScoreFinal
    type: alpaca
    field: text

# dataset_prepared_path: ./FinUpTagsNoTestNoExNew
val_set_size: 0
output_dir: models/meta-codellama-34b-python-Score8192V4
lora_model_dir: models/meta-codellama-34b-python-Score8192V4/checkpoint-55
auto_resume_from_checkpoints: true
sequence_len: 8192
sample_packing: true
pad_to_sequence_len: true
eval_sample_packing: False
adapter: lora
lora_model_dir:
lora_r: 2
lora_alpha: 16
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
lora_modules_to_save:
  - embed_tokens
  - lm_head

wandb_project: 'Code34bNewFeed'
wandb_entity:
wandb_watch:
wandb_run_id:
wandb_name: 'meta-codellama-34b-python-Score8192V4'
wandb_log_model:

gradient_accumulation_steps: 4
micro_batch_size: 1
num_epochs: 14
optimizer: adamw_torch
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
bf16: true
fp16:
tf32: false
hub_model_id: afrias5/meta-codellama-34b-python-Score8192V4
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: false
s2_attention:
logging_steps: 1
warmup_steps: 10
saves_per_epoch: 1
save_total_limit: 16
debug:
deepspeed:
weight_decay: 0.0
fsdp:
deepspeed: deepspeed_configs/zero3_bf16_cpuoffload_all.json
fsdp_config:
special_tokens:
  bos_token: "<s>"
  eos_token: "</s>"
  unk_token: "<unk>"

```

</details><br>

# meta-codellama-34b-python-Score8192V4

This model is a fine-tuned version of [meta-llama/CodeLlama-34b-Python-hf](https://huggingface.co/meta-llama/CodeLlama-34b-Python-hf) on the afrias5/datasetScoreFinal dataset.

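The card does not include usage instructions, so the following is only a minimal sketch of how one might load this LoRA adapter on top of the base model with Transformers and PEFT. It assumes `transformers`, `peft`, `accelerate`, and `torch` are installed, that the gated CodeLlama base checkpoint is accessible, and that enough GPU memory is available for a 34B model in bf16; the prompt is purely illustrative.

```python
# Minimal sketch (not from the original card): load the base CodeLlama model
# and apply this LoRA adapter with PEFT.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "meta-llama/CodeLlama-34b-Python-hf"
adapter_id = "afrias5/meta-codellama-34b-python-Score8192V4"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,  # training used bf16
    device_map="auto",           # requires accelerate
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

prompt = "def fibonacci(n):"  # illustrative prompt only
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

If a single standalone checkpoint is preferred, `merge_and_unload()` can fold the adapter weights into the base model before saving.
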
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

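The section above is a placeholder; what the axolotl config does record is that training used the `afrias5/datasetScoreFinal` dataset in alpaca format with a `text` field. A quick way to inspect it, assuming the dataset repository is accessible and exposes a `train` split, is sketched below.

```python
# Sketch only: inspect the dataset referenced in the axolotl config.
# Assumes the repo is readable with your HF credentials and has a "train" split.
from datasets import load_dataset

ds = load_dataset("afrias5/datasetScoreFinal", split="train")
print(ds)     # row count and column names
print(ds[0])  # first example
```
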
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0002
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 4
- total_train_batch_size: 8 (see the breakdown after this list)
- total_eval_batch_size: 2
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_steps: 10
- num_epochs: 14

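As referenced in the list above, the reported totals follow directly from the per-device settings; the small check below (illustrative only, not part of the generated card) spells out the arithmetic.

```python
# Derivation of the reported effective batch sizes from the per-device settings.
micro_batch_size = 1               # train_batch_size per device
gradient_accumulation_steps = 4
num_devices = 2
per_device_eval_batch_size = 1

total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
total_eval_batch_size = per_device_eval_batch_size * num_devices

print(total_train_batch_size)  # 8
print(total_eval_batch_size)   # 2
```
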
### Training results

No evaluation results were logged; the config sets `val_set_size: 0`, so no validation split was used.

### Framework versions

- PEFT 0.14.0
- Transformers 4.46.3
- PyTorch 2.5.1+cu121
- Datasets 3.1.0
- Tokenizers 0.20.3