---
license: llama3
library_name: peft
tags:
  - trl
  - sft
  - generated_from_trainer
base_model: meta-llama/Meta-Llama-3-8B-Instruct
model-index:
  - name: experiments
    results: []
---

# experiments

This model is a fine-tuned version of [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) on an unknown dataset. It achieves the following results on the evaluation set:

- Loss: 3.1279
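
Because the `library_name` is `peft`, this repository holds adapter weights rather than a full model, and they are applied on top of the base model at load time. A minimal loading sketch, assuming the adapter repo id `shin00001/experiments` (inferred from the uploader and model name, not confirmed by the card) and access to the gated base model:

```python
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

adapter_id = "shin00001/experiments"  # assumed repo id, see note above

# Loads the base model named in the adapter config
# (meta-llama/Meta-Llama-3-8B-Instruct) and applies the PEFT adapter on top.
model = AutoPeftModelForCausalLM.from_pretrained(
    adapter_id,
    torch_dtype=torch.float16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [{"role": "user", "content": "Hello!"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```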

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

- learning_rate: 0.0001
- train_batch_size: 1
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 8
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.05
- num_epochs: 70
- mixed_precision_training: Native AMP
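
A minimal sketch of these settings using the standard `transformers.TrainingArguments` API (the `trl`/`sft` tags suggest they were passed to an `SFTTrainer` in practice); `output_dir` and the choice of `fp16` for "Native AMP" are assumptions, not values from the card:

```python
from transformers import TrainingArguments

# Mirrors the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="experiments",        # assumed; not stated in the card
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=8,
    seed=42,
    gradient_accumulation_steps=8,   # effective train batch size: 1 x 8 = 8
    lr_scheduler_type="cosine",
    warmup_ratio=0.05,
    num_train_epochs=70,
    fp16=True,                       # "Native AMP"; bf16 is an equally plausible choice
)
```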

### Training results

| Training Loss | Epoch   | Step | Validation Loss |
|:-------------:|:-------:|:----:|:---------------:|
| 1.3621        | 3.4964  | 420  | 1.4379          |
| 1.1954        | 6.9927  | 840  | 1.4208          |
| 0.8507        | 10.4891 | 1260 | 1.5712          |
| 0.789         | 13.9854 | 1680 | 1.6759          |
| 0.5388        | 17.4818 | 2100 | 1.9153          |
| 0.4013        | 20.9781 | 2520 | 2.0319          |
| 0.2933        | 24.4745 | 2940 | 2.2094          |
| 0.207         | 27.9709 | 3360 | 2.3547          |
| 0.1604        | 31.4672 | 3780 | 2.5483          |
| 0.1154        | 34.9636 | 4200 | 2.5953          |
| 0.0982        | 38.4599 | 4620 | 2.7355          |
| 0.0954        | 41.9563 | 5040 | 2.8220          |
| 0.0677        | 45.4527 | 5460 | 2.8909          |
| 0.0613        | 48.9490 | 5880 | 2.9654          |
| 0.0482        | 52.4454 | 6300 | 3.0125          |
| 0.0415        | 55.9417 | 6720 | 3.0390          |
| 0.0477        | 59.4381 | 7140 | 3.0992          |
| 0.0412        | 62.9344 | 7560 | 3.1126          |
| 0.0327        | 66.4308 | 7980 | 3.1262          |
| 0.0391        | 69.9272 | 8400 | 3.1279          |

### Framework versions

- PEFT 0.11.1
- Transformers 4.41.2
- Pytorch 1.13.1+cu117
- Datasets 2.19.2
- Tokenizers 0.19.1