winglian committed on Mar 14

Commit c86d316 (verified) · 1 parent: 9452454

Add files using upload-large-folder tool

Files changed (33)

README.md +153 -0
axolotl/stratos.yaml +83 -0
model-00001-of-00030.safetensors +1 -1
model-00002-of-00030.safetensors +1 -1
model-00003-of-00030.safetensors +1 -1
model-00004-of-00030.safetensors +1 -1
model-00005-of-00030.safetensors +1 -1
model-00006-of-00030.safetensors +1 -1
model-00007-of-00030.safetensors +1 -1
model-00008-of-00030.safetensors +1 -1
model-00009-of-00030.safetensors +1 -1
model-00010-of-00030.safetensors +1 -1
model-00011-of-00030.safetensors +1 -1
model-00012-of-00030.safetensors +1 -1
model-00013-of-00030.safetensors +1 -1
model-00014-of-00030.safetensors +1 -1
model-00015-of-00030.safetensors +1 -1
model-00016-of-00030.safetensors +1 -1
model-00017-of-00030.safetensors +1 -1
model-00018-of-00030.safetensors +1 -1
model-00019-of-00030.safetensors +1 -1
model-00020-of-00030.safetensors +1 -1
model-00021-of-00030.safetensors +1 -1
model-00022-of-00030.safetensors +1 -1
model-00023-of-00030.safetensors +1 -1
model-00024-of-00030.safetensors +1 -1
model-00025-of-00030.safetensors +1 -1
model-00026-of-00030.safetensors +1 -1
model-00027-of-00030.safetensors +1 -1
model-00028-of-00030.safetensors +1 -1
model-00029-of-00030.safetensors +1 -1
model-00030-of-00030.safetensors +1 -1
training_args.bin +3 -0

README.md ADDED

	@@ -0,0 +1,153 @@

+---
+library_name: transformers
+license: llama3.1
+base_model: meta-llama/Llama-3.1-70B
+tags:
+- generated_from_trainer
+datasets:
+- bespokelabs/Bespoke-Stratos-17k
+model-index:
+- name: outputs/out/reasoning-70b-stratos
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+[<img src="https://raw.githubusercontent.com/axolotl-ai-cloud/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/axolotl-ai-cloud/axolotl)
+<details><summary>See axolotl config</summary>
+axolotl version: `0.8.0.dev0`
+```yaml
+base_model: meta-llama/Llama-3.1-70B
+# Automatically upload checkpoint and final model to HF
+# hub_model_id: username/custom_model_name
+#
+plugins:
+  - axolotl.integrations.liger.LigerPlugin
+  - axolotl.integrations.spectrum.SpectrumPlugin
+spectrum_top_fraction: 0.5
+spectrum_model_name: meta-llama/Meta-Llama-3.1-70B
+liger_rope: true
+liger_rms_norm: true
+liger_glu_activation: true
+liger_fused_linear_cross_entropy: true
+strict: false
+chat_template: llama3
+datasets:
+  - path: bespokelabs/Bespoke-Stratos-17k
+    field_messages: conversations
+    message_property_mappings:
+      content: value
+      role: from
+    split: train
+    type: chat_template
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.0
+output_dir: ./outputs/out/reasoning-70b-stratos
+save_safetensors: true
+wandb_project: reasoning-70b-stratos
+wandb_entity: axolotl-ai
+wandb_watch:
+wandb_name:
+wandb_log_model:
+sequence_len: 16384
+sample_packing: true
+pad_to_sequence_len: true
+gradient_accumulation_steps: 1
+micro_batch_size: 4
+num_epochs: 3
+optimizer: adamw_torch_fused
+lr_scheduler: rex
+learning_rate: 5.0e-6
+embedding_lr: 1.0e-5
+max_grad_norm: 1.0
+train_on_inputs: false
+group_by_length: false
+bf16: true
+tf32: true
+gradient_checkpointing: offload
+gradient_checkpointing_kwargs:
+  use_reentrant: true
+logging_steps: 1
+flash_attention: true
+warmup_steps: 20
+evals_per_epoch: 4
+saves_per_epoch: 1
+weight_decay: 0.01
+deepspeed: deepspeed_configs/zero3_bf16_cpuoffload_params.json
+special_tokens:
+  pad_token: <|finetune_right_pad_id|>
+  eos_token: <|eot_id|>
+added_tokens_overrides:
+  128011: <think>
+  128012: </think>
+  128013: <|begin_of_thought|>
+  128014: <|end_of_thought|>
+  128015: <|begin_of_solution|>
+  128016: <|end_of_solution|>
+fix_untrained_tokens:
+  - 128011
+  - 128012
+  - 128013
+  - 128014
+  - 128015
+  - 128016
+```
+</details><br>
+# outputs/out/reasoning-70b-stratos
+This model is a fine-tuned version of [meta-llama/Llama-3.1-70B](https://huggingface.co/meta-llama/Llama-3.1-70B) on the bespokelabs/Bespoke-Stratos-17k dataset.
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 4
+- eval_batch_size: 4
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 16
+- total_train_batch_size: 64
+- total_eval_batch_size: 64
+- optimizer: Use OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
+- lr_scheduler_type: rex
+- lr_scheduler_warmup_steps: 20
+- num_epochs: 3.0
+### Training results
+### Framework versions
+- Transformers 4.49.0
+- Pytorch 2.5.1+cu124
+- Datasets 3.2.0
+- Tokenizers 0.21.0
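
The `total_train_batch_size` of 64 reported in the model card follows from the other listed values: per-device batch size times device count times gradient accumulation steps. A minimal sketch of that arithmetic, using only values from the card and the axolotl config. The function name is illustrative, and the tokens-per-step figure is an upper bound that assumes every packed sequence is filled to `sequence_len`:

```python
def effective_batch_size(micro_batch_size: int, num_devices: int, grad_accum_steps: int) -> int:
    """Sequences consumed per optimizer step across all devices."""
    return micro_batch_size * num_devices * grad_accum_steps


# Values taken from the model card and axolotl config in this commit.
MICRO_BATCH_SIZE = 4    # micro_batch_size / train_batch_size
NUM_DEVICES = 16        # num_devices
GRAD_ACCUM_STEPS = 1    # gradient_accumulation_steps
SEQUENCE_LEN = 16384    # sequence_len

total_batch = effective_batch_size(MICRO_BATCH_SIZE, NUM_DEVICES, GRAD_ACCUM_STEPS)
# Upper bound: assumes each packed sequence is completely full.
max_tokens_per_step = total_batch * SEQUENCE_LEN

print(total_batch)          # 64
print(max_tokens_per_step)  # 1048576
```

With `gradient_accumulation_steps: 1`, each optimizer step therefore sees at most about one million tokens.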

axolotl/stratos.yaml ADDED

	@@ -0,0 +1,83 @@

+base_model: meta-llama/Llama-3.1-70B
+# Automatically upload checkpoint and final model to HF
+# hub_model_id: username/custom_model_name
+#
+plugins:
+  - axolotl.integrations.liger.LigerPlugin
+  - axolotl.integrations.spectrum.SpectrumPlugin
+spectrum_top_fraction: 0.5
+spectrum_model_name: meta-llama/Meta-Llama-3.1-70B
+liger_rope: true
+liger_rms_norm: true
+liger_glu_activation: true
+liger_fused_linear_cross_entropy: true
+strict: false
+chat_template: llama3
+datasets:
+  - path: bespokelabs/Bespoke-Stratos-17k
+    field_messages: conversations
+    message_property_mappings:
+      content: value
+      role: from
+    split: train
+    type: chat_template
+dataset_prepared_path: last_run_prepared
+val_set_size: 0.0
+output_dir: ./outputs/out/reasoning-70b-stratos
+save_safetensors: true
+wandb_project: reasoning-70b-stratos
+wandb_entity: axolotl-ai
+wandb_watch:
+wandb_name:
+wandb_log_model:
+sequence_len: 16384
+sample_packing: true
+pad_to_sequence_len: true
+gradient_accumulation_steps: 1
+micro_batch_size: 4
+num_epochs: 3
+optimizer: adamw_torch_fused
+lr_scheduler: rex
+learning_rate: 5.0e-6
+embedding_lr: 1.0e-5
+max_grad_norm: 1.0
+train_on_inputs: false
+group_by_length: false
+bf16: true
+tf32: true
+gradient_checkpointing: offload
+gradient_checkpointing_kwargs:
+  use_reentrant: true
+logging_steps: 1
+flash_attention: true
+warmup_steps: 20
+evals_per_epoch: 4
+saves_per_epoch: 1
+weight_decay: 0.01
+deepspeed: deepspeed_configs/zero3_bf16_cpuoffload_params.json
+special_tokens:
+  pad_token: <|finetune_right_pad_id|>
+  eos_token: <|eot_id|>
+added_tokens_overrides:
+  128011: <think>
+  128012: </think>
+  128013: <|begin_of_thought|>
+  128014: <|end_of_thought|>
+  128015: <|begin_of_solution|>
+  128016: <|end_of_solution|>
+fix_untrained_tokens:
+  - 128011
+  - 128012
+  - 128013
+  - 128014
+  - 128015
+  - 128016
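
The `message_property_mappings` entry in the config tells the `chat_template` dataset loader which keys in each message hold the role and the content (`from` and `value` here, under the `conversations` field). A rough sketch of that renaming in plain Python. This illustrates the idea only and is not axolotl's implementation; `remap_messages` and the sample record are hypothetical:

```python
from typing import Dict, List

# Mapping copied from `message_property_mappings` in the config:
# target property -> source key in each dataset message.
MESSAGE_PROPERTY_MAPPINGS: Dict[str, str] = {"content": "value", "role": "from"}


def remap_messages(
    conversation: List[Dict[str, str]],
    mappings: Dict[str, str] = MESSAGE_PROPERTY_MAPPINGS,
) -> List[Dict[str, str]]:
    """Rename each message's source keys to the role/content names a chat template expects."""
    return [
        {target: message[source] for target, source in mappings.items()}
        for message in conversation
    ]


# Hypothetical record shaped like the `conversations` field; not real dataset content.
example = [
    {"from": "user", "value": "What is 2 + 2?"},
    {"from": "assistant", "value": "4"},
]
remapped = remap_messages(example)
print(remapped[0])
```

After remapping, every message carries `role` and `content` keys, which is the shape the `llama3` chat template is applied to.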

model-00001-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d4db8f40a4ce09727d73149688e86ed3cce858dbd02fb0303fe63e3f09c5e3dc
+oid sha256:7ca80e1a754978eb9a37681a4f56a6cd4e1dd0daf1ecac0217e4e5974547df4c
 size 4584408808

model-00002-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75f1f5b729e4e0aaa8cd3c7dccb149b39c73a5432d5da4826593d722c4faad59
+oid sha256:1fea0db1a47fefcde9cb301d901060a6f54dd8fc0a64923f0bd906a6a2964a3c
 size 4664167376

model-00003-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2b5ee71bf64cb531801175114296152fb46442c35c3b27df8821a35bf9246108
+oid sha256:f0ead33ec88995dd2c9a52ec6687c88bc4f2f12341534e08bdba6205c3968170
 size 4999711704

model-00004-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:57f93a00320a8ccfc4ff3047b482665798e76a1aef272d491c02610e668df603
+oid sha256:90d0dbe4c6e6f84c0fd63848ae81caf543a2c6dca47e331181988b64b02b1b99
 size 4966157032

model-00005-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2d2e63eda4ecf0d4b10d504cf61684dcdb75b8f5b026ebaca70750801e1f43f8
+oid sha256:dc06f5d1c5a6172c02f302e542d606999102ad7d0443a5767be283f5254daf8d
 size 4664134408

model-00006-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2dc85746c8b1bcd87c14b16f18370f051a8558074a2ce35f4cf1f2570ca89027
+oid sha256:5a284b9fb811ea933d4d12cf68d7f148887c2b75616e88f86a4037c35992d698
 size 4664167408

model-00007-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:60869a2d57bec1ca347e0084d318f5c786ef64af1e3b9fe33c83b717a84c86c2
+oid sha256:315217971749a21d8bf1e29c1a35465ebd79cb83a551aa2ae22bf610d7914e2b
 size 4664167408

model-00008-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f1c27115ecbf1acd4331772cfa12d65cf9e9cbe294736dfc0ba439adca787d82
+oid sha256:ece7f3b69386b701c284fb94e8f1449836c970b824ba6dca9dae218c532d289e
 size 4999711728

model-00009-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:99cf6a25c42eb54ef95c7a0de05e45c51ad69f80991345ecd00c30ad790e4065
+oid sha256:f61de6c416a8fa6f3997320a18c40fd7771bd490bb1d5cf8e4e50801ff92c083
 size 4966157056

model-00010-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:cfefebdd7077b2bb51a803e29d87c13b3edf055931fa464da5e99c9d79d96d07
+oid sha256:a79e72a7989ec996f544e3df9e29f472c74bd59dcc58f17abb5dbc9328e8be6d
 size 4664134408

model-00011-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:52e6eacc03cf033535f5202fe9f967147cee214792b7ff355ce17712dd9fbede
+oid sha256:9b8a766a649ead5ba9c118ee25dd12c4fcdb96d15d04283a8161160c84f8f0e7
 size 4664167408

model-00012-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:75cee956056a25b830ede992feb638919f4567e517f537d5ca77485a8424d7e5
+oid sha256:254cd55d1512b2dc1969492a9029978bb3262ea4e373e87130c93a286f070d82
 size 4664167408

model-00013-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:8105bc43e195dbac5a6bd9fcadbfda375f61fb31d331b6abfa6c960398703b93
+oid sha256:085a29f24ba4379f3fddcb799342ab6f8f5770c17d150ff477b99b5289034743
 size 4999711728

model-00014-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:18da2a6db362c05fc0fab4d49662e7c223a75ef54e4f3fb8a9f8b5384fb8f8e3
+oid sha256:8e19e9c5fcd524342acdfdb5ed05de0e5c7f62e9209324b5856df29aa376019f
 size 4966157056

model-00015-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:0c6e54f2f97562d5f7a8a21d5c0fe95bdcf7ac30c5dc4cc667c243ab40e92a35
+oid sha256:141bc4a331009edfb197b808644a6aaba06c1ac0ecd088235b63469edc224291
 size 4664134408

model-00016-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a5f8f9e2dfad75c1806dd8054ff50a367edd1cc31baad4d87cd498a19bb328d7
+oid sha256:4198a40ca7ea153774800d9a55593d05d77a3b0768097090376c77b3406b03eb
 size 4664167408

model-00017-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1813480b64571208e16a975ba3a817de5711a073732ef2d60ec3fb4abaf9d187
+oid sha256:059331741bdf055b6774796222b86e295f504074807093614f163e38bf26f209
 size 4664167408

model-00018-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:f7bff86d5f01505d7e5cff14b6051faea006cdd3c0e7e598e4a6f4b6ce0de6c1
+oid sha256:edf8c97b86f3e23df8879b95b684fa8b34f44475060d41d442e762b5aa6c2657
 size 4999711728

model-00019-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:626c689ad0113548d287cf181d1fd9f51bdd35d04bc7f7dcd96f0ef741fe899d
+oid sha256:67bf7d65922c77f190b5833a85355de7358e4aed65b9c46ac7c57cd3fe6c7b21
 size 4966157056

model-00020-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5102e63eceda0c66a9a333489c542d66bdc7941afadaa40ba76f8282ba57f46d
+oid sha256:66d3264812cbe8c4693c96a0414e9ffa7904cefbd15e6b41966ff6f2d6bf9c6c
 size 4664134408

model-00021-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:2ee927b162600086c55de61090302c53f79e943a9849871b42de73406137592c
+oid sha256:be1bf6ca9dcc681dd25edcafd06e775b4da447e1278ed2465e0282dd122aa84b
 size 4664167408

model-00022-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:ffa571b3890158ee201674f1937883934b1798a093d0cb3371b403d58b0166b9
+oid sha256:19bc99461172d38bad0cd16094642cdb02db91efd9ba9c4fb453afc7bb006932
 size 4664167408

model-00023-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:01639e2f1ab9117bab357e157e2f4d012252c572d2cd6ab2756b9978466e91fa
+oid sha256:0d3eba96aaf2ea0404a4fbc6c3b8f49da08b0990100680c222676c3ee65e5beb
 size 4999711728

model-00024-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b12dfce2f5c6c48d1a5e7fa3fe34e52b67541662f3aa97523b23873adb0c1d08
+oid sha256:7ed2ebf355638c0c2b4935bc14dda01c04d0adaa6529a02975592829ad6285b1
 size 4966157056

model-00025-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:80b3ba6ed3f327b5108c183fbeca7271ad67b0cba5e5b91af634e62dd95a6bc1
+oid sha256:dc0edc2579a2a778be76dfc1909d8e866a04feea25b0719f0552ba9b12005c51
 size 4664134408

model-00026-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:a9951073773183da6e3195d9074094ca7ce33b2c1ad7e0db6d38c6383d4d3310
+oid sha256:25d2fe21ccca26a214df47e3f87c1c1d19a62d6ebe09a4eb2013fed6a4b18fb2
 size 4664167408

model-00027-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:882e444277d54574cdac4eb7ecbade9f0c7ad80dc65087bdd2de0766ee358bad
+oid sha256:3557b153f9f0a17f07bf2cc051ffbf0991983ddcbc85e73fb9ed095a585ecd59
 size 4664167408

model-00028-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d51c219c82d2ff8085cf27a8a99be72bff3f2b6287b3eaac98532b1f6f67f906
+oid sha256:23e0e49f71bce1c50286d9921d12b2bcc3123a3f369933449da9f5493c8a8cc4
 size 4999711728

model-00029-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:5a18f01c95dc02ac63a8a9e6a2f4996d327a4a6085d8d1caa1b3a59f51625db7
+oid sha256:b05d091c7b850cb3ad201613b048cb667eedf35970c70b16f0d2d866c49eec08
 size 4966173536

model-00030-of-00030.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:275b0bd384f1a37f4653cf3e61d3879dd3731cbd8e079568ea9aa88bc295ebed
+oid sha256:66e448c41a9cf0e8da3e10ab0af40ea28951f00f74e6272ffd2efed162ba1124
 size 2101346432

training_args.bin ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:0ac9bb2315889e2fba1ad65048b7750130f6a0625e3a8fba402553337fc9ca8c
+size 9144
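
Each `.safetensors` and `.bin` entry in this commit is a Git LFS pointer: a `version` line, an `oid sha256:` digest, and a `size` in bytes. A downloaded file can be checked against its pointer by comparing the byte size and the SHA-256 digest. The helper below is a minimal sketch using only the Python standard library; `parse_lfs_pointer` and `matches_pointer` are illustrative names, not part of any Hugging Face or Git LFS tooling:

```python
import hashlib
from pathlib import Path
from typing import Dict


def parse_lfs_pointer(text: str) -> Dict[str, str]:
    """Parse the 'key value' lines of a Git LFS pointer into a dict."""
    fields: Dict[str, str] = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path: Path, pointer_text: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the file's size and SHA-256 digest match the pointer."""
    fields = parse_lfs_pointer(pointer_text)
    algorithm, _, expected_digest = fields["oid"].partition(":")
    if algorithm != "sha256":
        raise ValueError(f"unsupported oid algorithm: {algorithm}")
    # Cheap check first: a size mismatch avoids hashing a multi-gigabyte shard.
    if path.stat().st_size != int(fields["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Stream in chunks so large shards are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_digest


# Pointer for training_args.bin as recorded in this commit.
TRAINING_ARGS_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:0ac9bb2315889e2fba1ad65048b7750130f6a0625e3a8fba402553337fc9ca8c
size 9144
"""
```

Calling `matches_pointer(Path("training_args.bin"), TRAINING_ARGS_POINTER)` on a local copy should return `True` only if the download is byte-identical to the uploaded file.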