shuff_100

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 322.2777
  • WER: 0.6311
  • CER: 0.2590
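WER (word error rate) and CER (character error rate) are both edit-distance metrics: the Levenshtein distance between the model output and the reference, divided by the reference length, computed over words and characters respectively. The evaluation pipeline for this model is not documented; the sketch below shows the standard computation with illustrative strings.

```python
# Illustrative sketch (not from this model's evaluation code):
# WER and CER as normalized Levenshtein distance over words and characters.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,        # deletion
                                     dp[j - 1] + 1,    # insertion
                                     prev + (r != h))  # substitution
    return dp[-1]

def wer(reference: str, hypothesis: str) -> float:
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)

def cer(reference: str, hypothesis: str) -> float:
    return edit_distance(reference, hypothesis) / len(reference)

ref = "the quick brown fox jumps over the lazy dog"
hyp = "the quick brown fox jump over a lazy dog"
print(f"WER: {wer(ref, hyp):.4f}")  # 2 substituted words out of 9 -> 0.2222
print(f"CER: {cer(ref, hyp):.4f}")
```

In practice the `jiwer` library (which the Transformers `evaluate` ecosystem wraps) implements the same computation, often with text normalization applied first.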

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0005
  • train_batch_size: 16
  • eval_batch_size: 8
  • seed: 42
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.15
  • num_epochs: 60.0
  • mixed_precision_training: Native AMP
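The linear scheduler with a 0.15 warmup ratio means the learning rate rises linearly from 0 to 0.0005 over the first 15% of training steps, then decays linearly back to 0. A minimal sketch of that schedule (the step count below is illustrative, not taken from this run):

```python
# Sketch of the linear-warmup-then-linear-decay schedule described above.
# Assumes the same semantics as transformers' get_linear_schedule_with_warmup.

def linear_warmup_lr(step, total_steps, base_lr=5e-4, warmup_ratio=0.15):
    """LR rises linearly to base_lr during warmup, then decays linearly to 0."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    # linear decay from base_lr at the end of warmup down to 0 at total_steps
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))

total = 1000  # illustrative total step count
print(linear_warmup_lr(0, total))    # start of warmup: 0.0
print(linear_warmup_lr(150, total))  # peak at end of warmup: 0.0005
print(linear_warmup_lr(1000, total)) # end of training: 0.0
```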

Training results

Training Loss | Epoch | Step  | Validation Loss | WER    | CER
2829.8126     | 1.0   | 3265  | 346.9778        | 0.7165 | 0.2613
761.7557      | 2.0   | 6530  | 242.2776        | 0.5436 | 0.2114
672.5873      | 3.0   | 9795  | 339.8657        | 0.5724 | 0.2624
668.1343      | 4.0   | 13060 | 278.7672        | 0.5458 | 0.2289
674.2274      | 5.0   | 16325 | 305.6685        | 0.6026 | 0.2488
707.3097      | 6.0   | 19590 | 285.1494        | 0.5852 | 0.2355
741.8663      | 7.0   | 22855 | 350.9450        | 0.6435 | 0.2659
764.1505      | 8.0   | 26120 | 320.6730        | 0.6421 | 0.2611
814.4658      | 9.0   | 29385 | 319.7635        | 0.6572 | 0.2670
842.7828      | 10.0  | 32650 | 375.5847        | 0.6620 | 0.2833
827.2463      | 11.0  | 35915 | 329.4790        | 0.6426 | 0.2613
812.5209      | 12.0  | 39180 | 322.2777        | 0.6311 | 0.2590

Framework versions

  • Transformers 4.48.3
  • Pytorch 2.5.1
  • Datasets 3.2.0
  • Tokenizers 0.21.0
Model size: 94.4M parameters (safetensors, F32)