# pixel-barec-pretrain
This model is a fine-tuned version of bensapir/pixel-barec-pretrain on the wikipedia + bookcorpus dataset. It achieves the following results on the evaluation set:
- Loss: 0.6179
## Model description
More information needed
## Intended uses & limitations
More information needed
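The card does not include usage code. Below is a minimal loading sketch, assuming the checkpoint is compatible with the stock transformers auto classes; if this is a PIXEL-style model (as the name suggests), the custom modeling classes from the PIXEL codebase are likely required instead, and `AutoModel` will raise an error.

```python
from transformers import AutoConfig, AutoModel

# Assumption: the repo id is taken from this card. Loading with the stock
# auto classes only works if the architecture is registered in your
# transformers install; otherwise use the model class from the PIXEL codebase.
checkpoint = "bensapir/pixel-barec-pretrain"
config = AutoConfig.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)
print(model.config.model_type, model.num_parameters())
```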
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 9.375e-06
- train_batch_size: 16
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.5
- training_steps: 200000
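These settings map onto a standard `TrainingArguments` configuration roughly as follows. This is a sketch for reproducibility, not the actual training script (which is not included in this card); `output_dir` is a placeholder.

```python
from transformers import TrainingArguments

# Approximation of the reported configuration using the standard Trainer API.
training_args = TrainingArguments(
    output_dir="pixel-barec-pretrain",  # placeholder
    learning_rate=9.375e-06,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.5,
    max_steps=200_000,
)
```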
### Training results
| Training Loss | Epoch  | Step   | Validation Loss |
|:-------------:|:------:|:------:|:---------------:|
| 0.8164        | 11.19  | 10000  | 0.7569          |
| 0.7702        | 22.37  | 20000  | 0.7498          |
| 0.7668        | 33.56  | 30000  | 0.7477          |
| 0.7655        | 44.74  | 40000  | 0.7451          |
| 0.7653        | 27.98  | 50000  | 0.7479          |
| 0.7648        | 33.58  | 60000  | 0.7448          |
| 0.7645        | 39.17  | 70000  | 0.7464          |
| 0.7642        | 44.77  | 80000  | 0.7450          |
| 0.7636        | 50.36  | 90000  | 0.7427          |
| 0.7602        | 55.96  | 100000 | 0.7262          |
| 0.7279        | 61.56  | 110000 | 0.6972          |
| 0.6981        | 67.15  | 120000 | 0.6809          |
| 0.6781        | 72.75  | 130000 | 0.6643          |
| 0.6612        | 78.34  | 140000 | 0.6534          |
| 0.6483        | 83.94  | 150000 | 0.6426          |
| 0.6389        | 89.54  | 160000 | 0.6357          |
| 0.6318        | 95.13  | 170000 | 0.6320          |
| 0.6261        | 100.73 | 180000 | 0.6280          |
| 0.6214        | 106.32 | 190000 | 0.6200          |
| 0.6177        | 111.92 | 200000 | 0.6200          |
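With `lr_scheduler_warmup_ratio: 0.5` over 200000 steps, the learning rate ramps up linearly until step 100000 and then decays along a cosine curve, which lines up with the validation loss plateauing near 0.74 for the first half of training before dropping. A minimal sketch of that schedule, assuming the standard transformers cosine-with-warmup formula (a hypothetical helper, not taken from the training script):

```python
import math

def cosine_with_warmup_lr(step: int,
                          peak_lr: float = 9.375e-06,
                          warmup_ratio: float = 0.5,
                          total_steps: int = 200_000) -> float:
    """Learning rate at `step` under a cosine-with-warmup schedule."""
    warmup_steps = int(total_steps * warmup_ratio)  # 100000 steps here
    if step < warmup_steps:
        # Linear warmup from 0 to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay from the peak down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))

for step in (10_000, 100_000, 150_000, 200_000):
    print(step, cosine_with_warmup_lr(step))
```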
### Framework versions
- Transformers 4.17.0
- Pytorch 2.5.1
- Datasets 2.1.1.dev0
- Tokenizers 0.21.1