train_copa_1753094177

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1169
  • Num Input Tokens Seen: 281856
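
Since this repository ships a PEFT adapter rather than full model weights, loading it requires the base model plus the adapter. Below is a minimal usage sketch, assuming the adapter is published under `rbelanec/train_copa_1753094177`, that you have access to the gated `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint, and that the COPA-style prompt format shown is illustrative (the actual prompt template used for training is not documented in this card).

```python
# Minimal loading/inference sketch (prompt format and generation settings are assumptions).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1753094177"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base_model, adapter_id)

# COPA asks which of two alternatives is the more plausible cause/effect of a premise.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```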

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how they map onto `TrainingArguments` follows the list:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
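
The exact training script is not included in this card, so the following is only a rough sketch of how the listed settings would be expressed with the standard Hugging Face `TrainingArguments` API; the `output_dir` name is illustrative.

```python
# Sketch of the reported hyperparameters as TrainingArguments (assumed, not the original script).
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1753094177",  # illustrative
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```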

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.6395 | 0.5 | 45 | 0.6500 | 14016 |
| 0.7006 | 1.0 | 90 | 0.4485 | 28096 |
| 0.1984 | 1.5 | 135 | 0.1417 | 42144 |
| 0.1022 | 2.0 | 180 | 0.1372 | 56128 |
| 0.0941 | 2.5 | 225 | 0.1288 | 70272 |
| 0.0492 | 3.0 | 270 | 0.1268 | 84352 |
| 0.1109 | 3.5 | 315 | 0.1257 | 98464 |
| 0.0217 | 4.0 | 360 | 0.1230 | 112576 |
| 0.1727 | 4.5 | 405 | 0.1221 | 126624 |
| 0.0202 | 5.0 | 450 | 0.1202 | 140832 |
| 0.052 | 5.5 | 495 | 0.1207 | 154976 |
| 0.0191 | 6.0 | 540 | 0.1234 | 169056 |
| 0.1793 | 6.5 | 585 | 0.1185 | 183200 |
| 0.0722 | 7.0 | 630 | 0.1177 | 197344 |
| 0.0602 | 7.5 | 675 | 0.1186 | 211392 |
| 0.0334 | 8.0 | 720 | 0.1204 | 225536 |
| 0.0266 | 8.5 | 765 | 0.1173 | 239680 |
| 0.046 | 9.0 | 810 | 0.1169 | 253696 |
| 0.0042 | 9.5 | 855 | 0.1186 | 267840 |
| 0.0783 | 10.0 | 900 | 0.1192 | 281856 |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1