train_copa_1753094179

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the copa dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1246
  • Num Input Tokens Seen: 281856
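
Since the card does not include a usage example, the snippet below is a minimal loading sketch. It assumes the PEFT adapter is published as rbelanec/train_copa_1753094179 (the repository this card belongs to), that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base weights, and that a COPA-style cause/effect prompt is a reasonable way to query the model; the actual prompt template used during training is not documented here.

```python
# Minimal loading sketch (repository id, device placement, and the COPA-style
# prompt format below are assumptions; device_map="auto" requires accelerate).
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_copa_1753094179"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base_model, adapter_id)

# COPA-style query: pick the more plausible cause/effect for a premise.
prompt = (
    "Premise: The man broke his toe. What was the cause?\n"
    "Choice 1: He got a hole in his sock.\n"
    "Choice 2: He dropped a hammer on his foot.\n"
    "Answer:"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```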

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a corresponding configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 123
  • optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 10.0
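
These settings map directly onto a transformers.TrainingArguments configuration. The sketch below is illustrative only: the actual training script is not part of this card, so the output directory and any logging or evaluation cadence are placeholders.

```python
# Illustrative mapping of the listed hyperparameters onto TrainingArguments.
# output_dir is a placeholder; all other values are copied from the list above.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="train_copa_1753094179",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=123,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=10.0,
)
```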

Training results

| Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|
| 0.1596 | 0.5 | 45 | 0.1958 | 14016 |
| 0.2404 | 1.0 | 90 | 0.1377 | 28096 |
| 0.1623 | 1.5 | 135 | 0.1277 | 42144 |
| 0.1 | 2.0 | 180 | 0.1283 | 56128 |
| 0.0681 | 2.5 | 225 | 0.1267 | 70272 |
| 0.0289 | 3.0 | 270 | 0.1246 | 84352 |
| 0.0638 | 3.5 | 315 | 0.1314 | 98464 |
| 0.0061 | 4.0 | 360 | 0.1305 | 112576 |
| 0.1354 | 4.5 | 405 | 0.1356 | 126624 |
| 0.0018 | 5.0 | 450 | 0.1401 | 140832 |
| 0.0111 | 5.5 | 495 | 0.1353 | 154976 |
| 0.0039 | 6.0 | 540 | 0.1413 | 169056 |
| 0.1049 | 6.5 | 585 | 0.1374 | 183200 |
| 0.0106 | 7.0 | 630 | 0.1402 | 197344 |
| 0.018 | 7.5 | 675 | 0.1404 | 211392 |
| 0.0021 | 8.0 | 720 | 0.1440 | 225536 |
| 0.0019 | 8.5 | 765 | 0.1409 | 239680 |
| 0.0019 | 9.0 | 810 | 0.1421 | 253696 |
| 0.0005 | 9.5 | 855 | 0.1449 | 267840 |
| 0.0235 | 10.0 | 900 | 0.1471 | 281856 |

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.7.1+cu126
  • Datasets 3.6.0
  • Tokenizers 0.21.1
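
To check that a local environment matches these pinned versions before loading the adapter, a small sketch like the following can be used (purely illustrative; the version strings are copied from the list above):

```python
# Compare locally installed versions against the versions this card was built with.
import importlib

expected = {
    "peft": "0.15.2",
    "transformers": "4.51.3",
    "torch": "2.7.1+cu126",
    "datasets": "3.6.0",
    "tokenizers": "0.21.1",
}
for name, want in expected.items():
    have = importlib.import_module(name).__version__
    marker = "OK" if have == want else "differs"
    print(f"{name}: installed {have}, card built with {want} ({marker})")
```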