pmahdavi/Llama-3.1-8B-coding-tulu3-ebs128-lr5e6-wsdcr0p4 at 933f998d364711b22a700456168a04e4177cbe13

Llama-3.1-8B-coding-tulu3-ebs128-lr5e6-wsdcr0p4

48.2 GB

1 contributor

History: 21 commits

pmahdavi

Add export_full_state_checkpoint-1100/exp_avg_sq.safetensors - exp_avg_sq.safetensors

933f998 verified 4 months ago

export_full_state_checkpoint-1100
Add export_full_state_checkpoint-1100/exp_avg_sq.safetensors - exp_avg_sq.safetensors 4 months ago
.gitattributes

1.57 kB

Add tokenizer.json 4 months ago
README.md

2.24 kB

Upload run root files (non-recursive) - README.md 4 months ago
all_results.json

360 Bytes

Upload run root files (non-recursive) - all_results.json 4 months ago
config.json

841 Bytes

Add config.json - config.json 4 months ago
eval_results.json

176 Bytes

Upload run root files (non-recursive) - eval_results.json 4 months ago
generation_config.json

180 Bytes

Upload run root files (non-recursive) - generation_config.json 4 months ago
model-00001-of-00004.safetensors

4.98 GB
xet

Add model shard 1/4 4 months ago
model-00002-of-00004.safetensors

5 GB
xet

Add model shard 2/4 4 months ago
model-00003-of-00004.safetensors

4.92 GB
xet

Add model shard 3/4 4 months ago
model-00004-of-00004.safetensors

1.17 GB
xet

Add model shard 4/4 4 months ago
model.safetensors.index.json

24 kB

Upload run root files (non-recursive) - model.safetensors.index.json 4 months ago
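
The index file above is what ties the four model shards together. As a minimal sketch, assuming the usual sharded-checkpoint layout in which `model.safetensors.index.json` holds a `weight_map` from tensor name to shard filename, the following counts how many tensors each shard holds. The example index and its tensor names are hypothetical placeholders, not values read from this repository.

```python
import json
from collections import Counter


def tensors_per_shard(index: dict) -> Counter:
    """Count how many tensors the index assigns to each shard file."""
    # Assumption: the index has a "weight_map" of tensor name -> shard filename.
    return Counter(index["weight_map"].values())


# Hypothetical, heavily truncated index used only for illustration.
example_index = json.loads("""
{
  "metadata": {"total_size": 0},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00004.safetensors",
    "model.layers.0.mlp.up_proj.weight": "model-00001-of-00004.safetensors",
    "lm_head.weight": "model-00004-of-00004.safetensors"
  }
}
""")

counts = tensors_per_shard(example_index)
for shard, n in sorted(counts.items()):
    print(f"{shard}: {n} tensors")
```

To run this against the real file, load it with `json.load(open("model.safetensors.index.json"))` and pass the result to `tensors_per_shard`.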
special_tokens_map.json

335 Bytes

Upload run root files (non-recursive) - special_tokens_map.json 4 months ago
tokenizer.json

17.2 MB
xet

Add tokenizer.json 4 months ago
tokenizer_config.json

51.2 kB

Upload run root files (non-recursive) - tokenizer_config.json 4 months ago
train_results.json

219 Bytes

Upload run root files (non-recursive) - train_results.json 4 months ago
trainer_log.jsonl

22.7 kB

Upload run root files (non-recursive) - trainer_log.jsonl 4 months ago
training_args.bin
Detected Pickle imports (14):
- llamafactory.hparams.training_args.TrainingArguments
- accelerate.state.PartialState
- transformers.integrations.deepspeed.HfTrainerDeepSpeedConfig
- transformers.trainer_utils.IntervalStrategy
- transformers.trainer_utils.SchedulerType
- torch.bfloat16
- transformers.trainer_utils.HubStrategy
- transformers.trainer_utils.SaveStrategy
- accelerate.utils.dataclasses.DeepSpeedPlugin
- transformers.training_args.OptimizerNames
- transformers.integrations.deepspeed.HfDeepSpeedConfig
- torch.device
- transformers.trainer_pt_utils.AcceleratorConfig
- accelerate.utils.dataclasses.DistributedType
7.8 kB
xet

Upload run root files (non-recursive) - training_args.bin 4 months ago
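
A list of pickle imports like the one reported for `training_args.bin` can be produced without unpickling anything, which matters because loading a pickle can execute arbitrary code. The sketch below is one possible approach using the standard-library `pickletools` module. It is not the scanner the hosting site uses. It assumes the file is a zip archive containing one or more `.pkl` members (the layout recent `torch.save` versions are understood to produce). The function names are hypothetical, and the `STACK_GLOBAL` handling is a heuristic that only looks at the two most recent string operands.

```python
import io
import pickletools
import zipfile

# Opcodes that push a text string; used to resolve STACK_GLOBAL operands.
_STRING_OPS = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}


def list_pickle_globals(data: bytes) -> set:
    """Return 'module.name' for every global referenced by a pickle stream.

    The stream is only disassembled, never loaded, so no code is executed.
    """
    found = set()
    recent_strings = []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols 0-3: the operand is "module name" joined by a space.
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name in _STRING_OPS:
            recent_strings.append(arg)
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Protocol 4+: heuristic, assumes module and name were the last
            # two strings pushed. Memoized strings fetched via GET are missed.
            found.add(f"{recent_strings[-2]}.{recent_strings[-1]}")
    return found


def globals_in_torch_archive(source) -> set:
    """Scan every .pkl member of a zip-format checkpoint (path or file object)."""
    found = set()
    with zipfile.ZipFile(source) as archive:
        for member in archive.namelist():
            if member.endswith(".pkl"):
                found |= list_pickle_globals(archive.read(member))
    return found
```

Calling `sorted(globals_in_torch_archive("training_args.bin"))` on a local copy would be expected to print names of the same kind as those listed above, though the exact set depends on how the file was written.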
training_config.yaml

856 Bytes

Upload run root files (non-recursive) - training_config.yaml 4 months ago
training_eval_loss.png

41.5 kB

Upload run root files (non-recursive) - training_eval_loss.png 4 months ago
training_loss.png

58.4 kB

Upload run root files (non-recursive) - training_loss.png 4 months ago
