train_openbookqa_1754652174

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the openbookqa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6985
  • Num Input Tokens Seen: 4204168

Model description

More information needed

Intended uses & limitations

More information needed
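This card does not document usage, but since the repository contains a PEFT adapter for meta-llama/Meta-Llama-3-8B-Instruct, a minimal loading sketch might look like the following. Assumptions: the adapter repo id is rbelanec/train_openbookqa_1754652174 (as shown on this page), you have access to the gated base model, and the multiple-choice prompt format shown is illustrative only, since the card does not specify the training prompt template.

```python
# Minimal sketch: load the base model and apply this PEFT adapter.
# Assumes access to the gated meta-llama base model; the prompt
# format below is an assumption, not the documented training template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_openbookqa_1754652174"  # this repo

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)

# OpenBookQA-style multiple-choice question.
prompt = (
    "Question: Which of these would let the most heat travel through?\n"
    "A) a new pair of jeans\n"
    "B) a steel spoon in a cafeteria\n"
    "C) a cotton candy at a store\n"
    "D) a calvin klein cotton hat\n"
    "Answer:"
)
messages = [{"role": "user", "content": prompt}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

out = model.generate(input_ids, max_new_tokens=8)
print(tokenizer.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True))
```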

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a reconstruction as TrainingArguments follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch (betas=(0.9, 0.999), epsilon=1e-08); no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
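
The list above maps onto Hugging Face TrainingArguments roughly as follows. This is a hedged sketch: the output_dir is an assumption, and the original training script, dataset formatting, and LoRA/PEFT configuration are not documented on this card.

```python
# Sketch of TrainingArguments matching the hyperparameters above.
# output_dir is assumed; the card does not include the training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_openbookqa_1754652174",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```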

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 4.7201        | 0.5   | 558   | 4.3558          | 210048            |
| 0.8136        | 1.0   | 1116  | 0.8080          | 420520            |
| 0.7929        | 1.5   | 1674  | 0.7312          | 630888            |
| 0.7296        | 2.0   | 2232  | 0.7163          | 841024            |
| 0.8036        | 2.5   | 2790  | 0.7250          | 1051168           |
| 0.7289        | 3.0   | 3348  | 0.7142          | 1261304           |
| 0.6980        | 3.5   | 3906  | 0.7066          | 1472152           |
| 0.7177        | 4.0   | 4464  | 0.7047          | 1682016           |
| 0.6778        | 4.5   | 5022  | 0.7019          | 1892160           |
| 0.7394        | 5.0   | 5580  | 0.7092          | 2102920           |
| 0.6996        | 5.5   | 6138  | 0.7000          | 2311976           |
| 0.7012        | 6.0   | 6696  | 0.7014          | 2523672           |
| 0.6863        | 6.5   | 7254  | 0.7022          | 2732440           |
| 0.6765        | 7.0   | 7812  | 0.7004          | 2943688           |
| 0.7108        | 7.5   | 8370  | 0.7001          | 3153640           |
| 0.7097        | 8.0   | 8928  | 0.6994          | 3363864           |
| 0.6701        | 8.5   | 9486  | 0.6988          | 3574616           |
| 0.7297        | 9.0   | 10044 | 0.6996          | 3783840           |
| 0.6966        | 9.5   | 10602 | 0.6985          | 3994976           |
| 0.6937        | 10.0  | 11160 | 0.7000          | 4204168           |
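
Note that the evaluation loss reported at the top of this card (0.6985) matches the epoch 9.5 checkpoint (step 10602), the lowest validation loss in the run, suggesting the best checkpoint rather than the final one (0.7000 at epoch 10.0) was kept.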

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
