Llama3-OpenBioLLM-8B-PsyCourse-fold6

This model is a PEFT adapter fine-tuned from aaditya/Llama3-OpenBioLLM-8B on the course-train-fold6 dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0381
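
For reference, a minimal loading-and-generation sketch using transformers and peft; the adapter repository id (chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold6) is taken from this card, and the dtype/device settings are assumptions not stated here:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the base model, then attach this card's PEFT adapter on top.
base = AutoModelForCausalLM.from_pretrained(
    "aaditya/Llama3-OpenBioLLM-8B",
    torch_dtype=torch.bfloat16,  # assumed dtype, not stated in the card
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("aaditya/Llama3-OpenBioLLM-8B")
model = PeftModel.from_pretrained(base, "chchen/Llama3-OpenBioLLM-8B-PsyCourse-fold6")

inputs = tokenizer("Example prompt:", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```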

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch reproducing them follows the list):

  • learning_rate: 0.0001
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 16
  • total_train_batch_size: 16
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 5.0
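
A TrainingArguments sketch mirroring the values above; the output_dir is a hypothetical name, and the PEFT/LoRA configuration used in the actual run is not recorded in this card:

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="Llama3-OpenBioLLM-8B-PsyCourse-fold6",  # hypothetical
    learning_rate=1e-4,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    seed=42,
    gradient_accumulation_steps=16,  # total train batch size: 1 * 16 = 16
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=5.0,
)
```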

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.4971        | 0.0751 | 50   | 0.3237          |
| 0.1179        | 0.1502 | 100  | 0.0942          |
| 0.0891        | 0.2254 | 150  | 0.0797          |
| 0.0539        | 0.3005 | 200  | 0.0598          |
| 0.065         | 0.3756 | 250  | 0.0575          |
| 0.052         | 0.4507 | 300  | 0.0530          |
| 0.0535        | 0.5258 | 350  | 0.0545          |
| 0.0512        | 0.6009 | 400  | 0.0466          |
| 0.0597        | 0.6761 | 450  | 0.0495          |
| 0.0509        | 0.7512 | 500  | 0.0469          |
| 0.0624        | 0.8263 | 550  | 0.0431          |
| 0.035         | 0.9014 | 600  | 0.0458          |
| 0.0546        | 0.9765 | 650  | 0.0444          |
| 0.0404        | 1.0516 | 700  | 0.0443          |
| 0.0342        | 1.1268 | 750  | 0.0434          |
| 0.0298        | 1.2019 | 800  | 0.0428          |
| 0.0348        | 1.2770 | 850  | 0.0407          |
| 0.0291        | 1.3521 | 900  | 0.0397          |
| 0.0373        | 1.4272 | 950  | 0.0397          |
| 0.0289        | 1.5023 | 1000 | 0.0406          |
| 0.0339        | 1.5775 | 1050 | 0.0450          |
| 0.0244        | 1.6526 | 1100 | 0.0412          |
| 0.0288        | 1.7277 | 1150 | 0.0402          |
| 0.036         | 1.8028 | 1200 | 0.0395          |
| 0.0334        | 1.8779 | 1250 | 0.0392          |
| 0.0568        | 1.9531 | 1300 | 0.0428          |
| 0.0206        | 2.0282 | 1350 | 0.0409          |
| 0.0226        | 2.1033 | 1400 | 0.0408          |
| 0.0257        | 2.1784 | 1450 | 0.0407          |
| 0.0138        | 2.2535 | 1500 | 0.0414          |
| 0.0167        | 2.3286 | 1550 | 0.0409          |
| 0.0217        | 2.4038 | 1600 | 0.0388          |
| 0.0195        | 2.4789 | 1650 | 0.0427          |
| 0.0223        | 2.5540 | 1700 | 0.0437          |
| 0.0195        | 2.6291 | 1750 | 0.0428          |
| 0.0218        | 2.7042 | 1800 | 0.0409          |
| 0.0189        | 2.7793 | 1850 | 0.0401          |
| 0.0195        | 2.8545 | 1900 | 0.0381          |
| 0.0188        | 2.9296 | 1950 | 0.0399          |
| 0.0105        | 3.0047 | 2000 | 0.0412          |
| 0.0073        | 3.0798 | 2050 | 0.0443          |
| 0.0082        | 3.1549 | 2100 | 0.0466          |
| 0.0107        | 3.2300 | 2150 | 0.0490          |
| 0.0143        | 3.3052 | 2200 | 0.0447          |
| 0.005         | 3.3803 | 2250 | 0.0471          |
| 0.0079        | 3.4554 | 2300 | 0.0481          |
| 0.0085        | 3.5305 | 2350 | 0.0501          |
| 0.0073        | 3.6056 | 2400 | 0.0468          |
| 0.0034        | 3.6808 | 2450 | 0.0481          |
| 0.0053        | 3.7559 | 2500 | 0.0496          |
| 0.0052        | 3.8310 | 2550 | 0.0498          |
| 0.0117        | 3.9061 | 2600 | 0.0501          |
| 0.0082        | 3.9812 | 2650 | 0.0500          |
| 0.0054        | 4.0563 | 2700 | 0.0516          |
| 0.0019        | 4.1315 | 2750 | 0.0546          |
| 0.0048        | 4.2066 | 2800 | 0.0565          |
| 0.0026        | 4.2817 | 2850 | 0.0583          |
| 0.001         | 4.3568 | 2900 | 0.0608          |
| 0.0022        | 4.4319 | 2950 | 0.0609          |
| 0.0036        | 4.5070 | 3000 | 0.0615          |
| 0.0006        | 4.5822 | 3050 | 0.0621          |
| 0.0019        | 4.6573 | 3100 | 0.0624          |
| 0.0027        | 4.7324 | 3150 | 0.0628          |
| 0.0022        | 4.8075 | 3200 | 0.0629          |
| 0.002         | 4.8826 | 3250 | 0.0630          |
| 0.0063        | 4.9577 | 3300 | 0.0628          |
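
Validation loss bottoms out at 0.0381 at step 1900 (epoch ~2.85) and then rises while training loss keeps falling, a typical overfitting pattern; the reported evaluation loss therefore appears to come from that best checkpoint rather than the final one. If reproducing the run, early stopping and best-checkpoint selection would capture this point. A sketch using transformers' built-in callback; these settings are assumptions, not taken from the original run:

```python
from transformers import EarlyStoppingCallback, Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="Llama3-OpenBioLLM-8B-PsyCourse-fold6",  # hypothetical
    eval_strategy="steps",
    eval_steps=50,                     # matches the 50-step cadence above
    save_strategy="steps",
    save_steps=50,
    load_best_model_at_end=True,       # reload the checkpoint with the best...
    metric_for_best_model="eval_loss", # ...eval_loss at the end of training
    greater_is_better=False,
    # ...plus the hyperparameters listed earlier
)
# Stop after three consecutive evaluations without improvement in eval_loss.
callbacks = [EarlyStoppingCallback(early_stopping_patience=3)]
# Trainer(model=model, args=args, callbacks=callbacks, ...)
```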

Framework versions

  • PEFT 0.12.0
  • Transformers 4.46.1
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
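
A quick way to confirm a local environment matches these pins, assuming the packages are installed:

```python
import datasets, peft, tokenizers, torch, transformers

# Print installed versions for comparison with the list above.
for name, module in [("PEFT", peft), ("Transformers", transformers),
                     ("PyTorch", torch), ("Datasets", datasets),
                     ("Tokenizers", tokenizers)]:
    print(name, module.__version__)
```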
