train_record_1753094160

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the record dataset. It achieves the following results on the evaluation set:

  • Loss: 0.2664
  • Num Input Tokens Seen: 464483424
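Since this repository ships a PEFT adapter (see Framework versions below), here is a minimal loading sketch. The repository id rbelanec/train_record_1753094160 is taken from this card's model tree, but the dtype, device placement, and generation settings are assumptions, and the gated base model must be accessible with your Hugging Face credentials:

```python
# Minimal sketch: load the PEFT adapter on top of the base model.
# dtype/device_map/generation settings below are illustrative assumptions.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B-Instruct",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
model = PeftModel.from_pretrained(base, "rbelanec/train_record_1753094160")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

inputs = tokenizer("Hello", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```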

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
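For reference, a sketch of the equivalent transformers TrainingArguments reconstructed from the list above; the output_dir is an assumption, and the PEFT/LoRA configuration used for the adapter is not recorded on this card:

```python
# Sketch: the listed hyperparameters expressed as TrainingArguments.
# output_dir is assumed; the adapter (PEFT) config is not recorded here.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_record_1753094160",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```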

Training results

Training Loss  Epoch  Step    Validation Loss  Input Tokens Seen
0.2352         0.5    15621   0.3098           23227520
0.2159         1.0    31242   0.2921           46454112
0.2584         1.5    46863   0.2711           69694624
0.1790         2.0    62484   0.2664           92908288
0.1124         2.5    78105   0.2872           116099296
0.1993         3.0    93726   0.2908           139351808
0.1558         3.5    109347  0.3250           162566976
0.1782         4.0    124968  0.3064           185790304
0.1187         4.5    140589  0.3423           208997696
0.0991         5.0    156210  0.3587           232243968
0.1312         5.5    171831  0.4016           255458112
0.2085         6.0    187452  0.3625           278686752
0.1709         6.5    203073  0.4192           301925344
0.1077         7.0    218694  0.4075           325137568
0.1939         7.5    234315  0.4813           348361920
0.0795         8.0    249936  0.4618           371592704
0.1507         8.5    265557  0.5457           394838368
0.0835         9.0    281178  0.5146           418033696
0.1975         9.5    296799  0.6221           441282560
0.1145         10.0   312420  0.6266           464483424

The reported evaluation loss of 0.2664 matches the epoch 2.0 checkpoint, where validation loss is lowest; validation loss rises steadily in later epochs, which suggests the best checkpoint rather than the final one was the one evaluated.

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • PyTorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
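To check that a reproduction environment matches these versions, a small sanity-check sketch; note the torch build string (2.7.1+cu126) is specific to the CUDA 12.6 wheel and will differ on other setups:

```python
# Sanity-check sketch: assert installed versions match the ones listed above.
# The +cu126 suffix on the torch version is build-specific.
import datasets
import peft
import tokenizers
import torch
import transformers

expected = {
    peft: "0.15.2",
    transformers: "4.51.3",
    torch: "2.7.1+cu126",
    datasets: "3.6.0",
    tokenizers: "0.21.1",
}
for module, version in expected.items():
    assert module.__version__ == version, (module.__name__, module.__version__)
```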