trainer

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.3546
  • Accuracy: 0.8807

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.032227
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 4096
  • optimizer: schedule-free AdamW (OptimizerNames.SCHEDULE_FREE_ADAMW) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: constant
  • lr_scheduler_warmup_steps: 1000
  • training_steps: 1000000
  • mixed_precision_training: Native AMP
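A minimal sketch of how this configuration could be expressed with transformers.TrainingArguments, reconstructed from the list above. The model, dataset, and Trainer setup are omitted because the card does not specify them; schedule-free AdamW additionally requires the `schedulefree` package, and fp16 is an assumption for "Native AMP" (the run may have used bf16 instead).

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="trainer",
    learning_rate=0.032227,
    per_device_train_batch_size=512,   # train_batch_size
    per_device_eval_batch_size=512,    # eval_batch_size
    gradient_accumulation_steps=8,     # 512 * 8 = 4096 total_train_batch_size
    seed=42,
    optim="schedule_free_adamw",       # betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="constant",      # schedule-free AdamW manages its own schedule
    warmup_steps=1000,
    max_steps=1_000_000,
    fp16=True,                         # assumption: "Native AMP" taken to mean fp16
)
```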

Training results

| Training Loss | Epoch  | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:----:|:---------------:|:--------:|
| No log        | 0      | 0    | 4.3903          | 0.0137   |
| No log        | 0.0044 | 122  | 1.1251          | 0.6574   |
| No log        | 0.0087 | 244  | 0.8266          | 0.7365   |
| No log        | 0.0131 | 366  | 0.7493          | 0.7590   |
| No log        | 0.0175 | 488  | 0.6913          | 0.7755   |
| 9.1782        | 0.0218 | 610  | 0.6348          | 0.7927   |
| 9.1782        | 0.0262 | 732  | 0.5897          | 0.8064   |
| 9.1782        | 0.0306 | 854  | 0.5569          | 0.8170   |
| 9.1782        | 0.0349 | 976  | 0.5262          | 0.8266   |
| 5.0917        | 0.0393 | 1098 | 0.4957          | 0.8360   |
| 5.0917        | 0.0437 | 1220 | 0.4761          | 0.8424   |
| 5.0917        | 0.0480 | 1342 | 0.4616          | 0.8464   |
| 5.0917        | 0.0524 | 1464 | 0.4479          | 0.8510   |
| 4.0398        | 0.0568 | 1586 | 0.4397          | 0.8536   |
| 4.0398        | 0.0611 | 1708 | 0.4293          | 0.8564   |
| 4.0398        | 0.0655 | 1830 | 0.4231          | 0.8592   |
| 4.0398        | 0.0699 | 1952 | 0.4139          | 0.8614   |
| 3.5268        | 0.0743 | 2074 | 0.4088          | 0.8635   |
| 3.5268        | 0.0786 | 2196 | 0.4035          | 0.8649   |
| 3.5268        | 0.0830 | 2318 | 0.4000          | 0.8666   |
| 3.5268        | 0.0874 | 2440 | 0.3950          | 0.8678   |
| 3.3084        | 0.0917 | 2562 | 0.3915          | 0.8688   |
| 3.3084        | 0.0961 | 2684 | 0.3866          | 0.8705   |
| 3.3084        | 0.1005 | 2806 | 0.3843          | 0.8712   |
| 3.3084        | 0.1048 | 2928 | 0.3804          | 0.8726   |
| 3.1769        | 0.1092 | 3050 | 0.3776          | 0.8733   |
| 3.1769        | 0.1136 | 3172 | 0.3729          | 0.8749   |
| 3.1769        | 0.1179 | 3294 | 0.3723          | 0.8751   |
| 3.1769        | 0.1223 | 3416 | 0.3698          | 0.8759   |
| 3.0785        | 0.1267 | 3538 | 0.3659          | 0.8772   |
| 3.0785        | 0.1310 | 3660 | 0.3644          | 0.8775   |
| 3.0785        | 0.1354 | 3782 | 0.3599          | 0.8788   |
| 3.0785        | 0.1398 | 3904 | 0.3584          | 0.8794   |
| 2.9831        | 0.1441 | 4026 | 0.3567          | 0.8800   |
| 2.9831        | 0.1485 | 4148 | 0.3528          | 0.8817   |
| 2.9831        | 0.1529 | 4270 | 0.3535          | 0.8811   |
| 2.9831        | 0.1572 | 4392 | 0.3541          | 0.8809   |
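A hedged sketch of how the accuracy metric above could be computed during evaluation. The card does not show the actual metric code, so this uses the `evaluate` library (not listed in the framework versions below) purely as an illustration of an argmax-over-logits accuracy.

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair handed over by the Trainer.
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return accuracy.compute(predictions=predictions, references=labels)
```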

Framework versions

  • Transformers 4.52.2
  • PyTorch 2.8.0.dev20250521+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1
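A quick way to check that a local environment matches the versions above, assuming the four packages are already installed:

```python
import transformers, torch, datasets, tokenizers

print(transformers.__version__)  # expected: 4.52.2
print(torch.__version__)         # expected: 2.8.0.dev20250521+cu128
print(datasets.__version__)      # expected: 3.6.0
print(tokenizers.__version__)    # expected: 0.21.1
```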