pixel-barec-pretrain

This model is a fine-tuned version of bensapir/pixel-barec-pretrain on the wikipedia and bookcorpus datasets. It achieves the following results on the evaluation set:

  • Loss: 0.6179

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 9.375e-06
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.5
  • training_steps: 200000
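
The interaction between the cosine scheduler, the 0.5 warmup ratio, and the 200000 training steps can be sketched in plain Python. This is a minimal illustration, not the training code: it assumes linear warmup from zero followed by a half-cosine decay to zero, which is the usual behavior of a cosine schedule with warmup, and the name lr_at is hypothetical. Under these assumptions the learning rate peaks at step 100000, halfway through training.

```python
import math

BASE_LR = 9.375e-06      # learning_rate from the list above
TOTAL_STEPS = 200_000    # training_steps from the list above
WARMUP_RATIO = 0.5       # lr_scheduler_warmup_ratio from the list above
WARMUP_STEPS = int(TOTAL_STEPS * WARMUP_RATIO)  # 100000 steps of warmup


def lr_at(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Assumes linear warmup from 0 to BASE_LR, then a half-cosine
    decay from BASE_LR down to 0 at TOTAL_STEPS.
    """
    if step < WARMUP_STEPS:
        # Linear warmup phase
        return BASE_LR * step / max(1, WARMUP_STEPS)
    # Fraction of the decay phase that has elapsed, in [0, 1]
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    # Half-cosine decay phase
    return BASE_LR * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    for step in (0, 50_000, 100_000, 150_000, 200_000):
        print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Calling lr_at at the logged steps (10000, 20000, and so on) gives the approximate learning rate in effect at each row of the training results, under the same assumptions.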

Training results

Training Loss   Epoch    Step     Validation Loss
0.8164          11.19    10000    0.7569
0.7702          22.37    20000    0.7498
0.7668          33.56    30000    0.7477
0.7655          44.74    40000    0.7451
0.7653          27.98    50000    0.7479
0.7648          33.58    60000    0.7448
0.7645          39.17    70000    0.7464
0.7642          44.77    80000    0.7450
0.7636          50.36    90000    0.7427
0.7602          55.96    100000   0.7262
0.7279          61.56    110000   0.6972
0.6981          67.15    120000   0.6809
0.6781          72.75    130000   0.6643
0.6612          78.34    140000   0.6534
0.6483          83.94    150000   0.6426
0.6389          89.54    160000   0.6357
0.6318          95.13    170000   0.6320
0.6261          100.73   180000   0.6280
0.6214          106.32   190000   0.6200
0.6177          111.92   200000   0.6200

Framework versions

  • Transformers 4.17.0
  • PyTorch 2.5.1
  • Datasets 2.1.1.dev0
  • Tokenizers 0.21.1