# pixel-barec-pretrain
This model is a fine-tuned version of bensapir/pixel-barec-pretrain on the wikipedia + bookcorpus dataset. It achieves the following results on the evaluation set:
- Loss: 0.6179
## Model description
More information needed
## Intended uses & limitations
More information needed
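The card does not include usage code. Below is a minimal loading sketch, assuming the checkpoint is compatible with the stock transformers auto classes; if this is a PIXEL-style model (as the name suggests), the custom modeling classes from the PIXEL codebase are likely required instead, and `AutoModel` will raise an error.

```python
from transformers import AutoConfig, AutoModel

# Assumption: the repo id is taken from this card. Loading with the stock
# auto classes only works if the architecture is registered in your
# transformers install; otherwise use the model class from the PIXEL codebase.
checkpoint = "bensapir/pixel-barec-pretrain"
config = AutoConfig.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)
print(model.config.model_type, model.num_parameters())
```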
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9.375e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.5
- training_steps: 200000
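These settings map onto a standard `TrainingArguments` configuration roughly as follows. This is a sketch for reproducibility, not the actual training script (which is not included in this card); `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Approximation of the reported configuration using the standard Trainer API.
training_args = TrainingArguments(
    output_dir="pixel-barec-pretrain",  # placeholder
    learning_rate=9.375e-06,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.5,
    max_steps=200_000,
)
```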
### Training results
| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.8164        | 11.19  | 10000  | 0.7569          |
| 0.7702        | 22.37  | 20000  | 0.7498          |
| 0.7668        | 33.56  | 30000  | 0.7477          |
| 0.7655        | 44.74  | 40000  | 0.7451          |
| 0.7653        | 27.98  | 50000  | 0.7479          |
| 0.7648        | 33.58  | 60000  | 0.7448          |
| 0.7645        | 39.17  | 70000  | 0.7464          |
| 0.7642        | 44.77  | 80000  | 0.7450          |
| 0.7636        | 50.36  | 90000  | 0.7427          |
| 0.7602        | 55.96  | 100000 | 0.7262          |
| 0.7279        | 61.56  | 110000 | 0.6972          |
| 0.6981        | 67.15  | 120000 | 0.6809          |
| 0.6781        | 72.75  | 130000 | 0.6643          |
| 0.6612        | 78.34  | 140000 | 0.6534          |
| 0.6483        | 83.94  | 150000 | 0.6426          |
| 0.6389        | 89.54  | 160000 | 0.6357          |
| 0.6318        | 95.13  | 170000 | 0.6320          |
| 0.6261        | 100.73 | 180000 | 0.6280          |
| 0.6214        | 106.32 | 190000 | 0.6200          |
| 0.6177        | 111.92 | 200000 | 0.6200          |
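With `lr_scheduler_warmup_ratio: 0.5` over 200000 steps, the learning rate ramps up linearly until step 100000 and then decays along a cosine curve, which lines up with the validation loss plateauing near 0.74 for the first half of training before dropping. A minimal sketch of that schedule, assuming the standard transformers cosine-with-warmup formula (a hypothetical helper, not taken from the training script):

```python
import math

def cosine_with_warmup_lr(step: int,
                          peak_lr: float = 9.375e-06,
                          warmup_ratio: float = 0.5,
                          total_steps: int = 200_000) -> float:
    """Learning rate at `step` under a cosine-with-warmup schedule."""
    warmup_steps = int(total_steps * warmup_ratio)  # 100000 steps here
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

for step in (10_000, 100_000, 150_000, 200_000):
    print(step, cosine_with_warmup_lr(step))
```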
### Framework versions
- Transformers 4.17.0
- Pytorch 2.5.1
- Datasets 2.1.1.dev0
- Tokenizers 0.21.1