Se124M100KInfPrompt_WT_EOS_medium

This model is a fine-tuned version of gpt2-medium; the training dataset is not specified in this card. It achieves the following result on the evaluation set:

  • Loss: 0.7127
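
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is a minimal sketch that assumes the loss is the mean token-level cross-entropy in nats, which is the usual convention for causal language models trained with the Transformers Trainer; the card itself does not state this.

```python
import math

# Evaluation loss reported above.
eval_loss = 0.7127

# Assumption: the loss is mean per-token cross-entropy in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~= {perplexity:.2f}")
```

Under that assumption the perplexity comes out at about 2.04.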

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.03
  • num_epochs: 3
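
The hyperparameters above can be written as keyword arguments for `transformers.TrainingArguments`. The following is a sketch rather than the author's actual training script: the key names are the standard `TrainingArguments` parameters, while `output_dir` and the single-device setting are assumptions (a single device is inferred from 64 × 4 = 256, since the card lists no device count).

```python
# Keyword arguments mirroring the hyperparameters listed in this card.
training_kwargs = {
    "output_dir": "./outputs",            # hypothetical path, not from the card
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 64,
    "per_device_eval_batch_size": 64,
    "seed": 42,
    "gradient_accumulation_steps": 4,
    "optim": "adamw_torch_fused",         # fused AdamW
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "cosine",
    "warmup_ratio": 0.03,
    "num_train_epochs": 3,
}

# Assumption: training ran on one device.
num_devices = 1

# Effective batch size per optimizer step.
effective_batch_size = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
    * num_devices
)

print(effective_batch_size)  # 256, matching total_train_batch_size
```

With Transformers installed, the dictionary can be passed as `TrainingArguments(**training_kwargs)`.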

Training results

Training Loss  Epoch   Step  Validation Loss
2.8652         0.0655  20    2.6742
2.6735         0.1309  40    2.4205
2.3498         0.1964  60    2.0554
1.9542         0.2619  80    1.6239
1.5661         0.3273  100   1.2791
1.3052         0.3928  120   1.0776
1.1291         0.4583  140   0.9537
1.0151         0.5237  160   0.8837
0.9431         0.5892  180   0.8324
0.8821         0.6547  200   0.8044
0.8536         0.7201  220   0.7846
0.8371         0.7856  240   0.7712
0.8281         0.8511  260   0.7628
0.8077         0.9165  280   0.7553
0.8013         0.9820  300   0.7501
0.7948         1.0458  320   0.7447
0.783          1.1113  340   0.7394
0.7727         1.1768  360   0.7372
0.777          1.2422  380   0.7331
0.7711         1.3077  400   0.7309
0.7642         1.3732  420   0.7289
0.7631         1.4386  440   0.7267
0.7581         1.5041  460   0.7250
0.7606         1.5696  480   0.7233
0.7578         1.6350  500   0.7223
0.7562         1.7005  520   0.7208
0.7497         1.7660  540   0.7195
0.7508         1.8314  560   0.7179
0.7476         1.8969  580   0.7168
0.7503         1.9624  600   0.7165
0.7414         2.0262  620   0.7164
0.7425         2.0917  640   0.7159
0.7451         2.1571  660   0.7146
0.7452         2.2226  680   0.7147
0.7446         2.2881  700   0.7138
0.7437         2.3535  720   0.7140
0.7397         2.4190  740   0.7131
0.7426         2.4845  760   0.7130
0.7421         2.5499  780   0.7127
0.7408         2.6154  800   0.7135
0.7413         2.6809  820   0.7135
0.7404         2.7463  840   0.7131
0.7373         2.8118  860   0.7128
0.7451         2.8773  880   0.7134
0.7407         2.9427  900   0.7127
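
The Epoch and Step columns make it possible to estimate how many optimizer steps make up one epoch, and from that the approximate size of the training set. The sketch below uses a few rows copied from the table and the total_train_batch_size of 256; it assumes every optimizer step consumed a full batch, so the result is an estimate rather than a figure stated by the card.

```python
# (epoch, step) pairs copied from the training results table.
logged = [(0.0655, 20), (0.9820, 300), (1.9624, 600), (2.9427, 900)]

# Each row gives one estimate of optimizer steps per epoch.
estimates = [step / epoch for epoch, step in logged]
steps_per_epoch = sum(estimates) / len(estimates)

# total_train_batch_size from the hyperparameter list.
total_train_batch_size = 256

# Assumption: every optimizer step processed a full batch of sequences.
sequences_per_epoch = steps_per_epoch * total_train_batch_size

print(f"steps per epoch     ~= {steps_per_epoch:.1f}")
print(f"sequences per epoch ~= {sequences_per_epoch:,.0f}")
```

The rows agree on roughly 306 steps per epoch, which corresponds to something on the order of 78,000 training sequences per epoch.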

Framework versions

  • PEFT 0.15.1
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu118
  • Datasets 3.5.0
  • Tokenizers 0.21.1
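
Because PEFT appears in the framework versions, the checkpoint is presumably an adapter applied on top of gpt2-medium rather than a full set of weights. The following sketch shows one way such an adapter could be loaded; `load_adapter` and its `adapter_id` parameter are hypothetical names, and the exact Hub repository id is not confirmed by this card.

```python
def load_adapter(adapter_id: str, base_id: str = "gpt2-medium"):
    """Load the base model and attach a PEFT adapter (assumed layout)."""
    # Imported inside the function so the sketch can be imported
    # even where transformers and peft are not installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    # Assumption: the base tokenizer is suitable; if the adapter repository
    # ships its own tokenizer files, load the tokenizer from adapter_id instead.
    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)

    # Attach the adapter weights to the base model.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return model, tokenizer
```

Calling `load_adapter` with the adapter's Hub repository id (or a local directory containing the adapter files) returns the model and tokenizer ready for generation.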
