Distill Whisper Call Center NER 1000

This model is a fine-tuned version of lelapa/distill_whisper_call_center_en_merged on the lelapa/Names_Accents dataset. After 1000 training steps it achieves the following results on the evaluation set:

  • Loss: 0.1828
  • Wer: 13.1183
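
The card does not say how the WER figure was computed or which text normalization was applied. As a minimal sketch, assuming the usual definition (word-level edit distance divided by the number of reference words) and assuming the value above is on a percentage scale, the metric can be illustrated as follows. The function name and the example sentences are hypothetical and are not taken from the evaluation set.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a percentage: word-level edit distance / reference length * 100."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return 100.0 * prev[-1] / len(ref)


# Hypothetical example: one misspelled name plus one dropped word
# gives 2 errors over 5 reference words.
example_wer = word_error_rate(
    "please call thandiwe nkosi back",
    "please call tandiwe nkosi",
)
print(example_wer)  # 40.0
```

Under this reading, a WER of 13.1183 would correspond to roughly 13 word errors per 100 reference words.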

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 256
  • total_eval_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 125
  • training_steps: 1000
  • mixed_precision_training: Native AMP
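
The totals in the list follow from the per-device values, and the learning rate at any step follows from the warmup and step counts. The sketch below recomputes both. It assumes that the linear scheduler ramps from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the final step, which is the usual shape of a linear schedule with warmup; the function name `linear_lr` is hypothetical.

```python
# Values copied from the hyperparameter list above.
train_batch_size = 16
eval_batch_size = 16
num_devices = 8
gradient_accumulation_steps = 2

# Effective batch sizes: per-device size times device count,
# and for training also times the accumulation steps.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
print(total_train_batch_size, total_eval_batch_size)  # 256 128


def linear_lr(step: int, base_lr: float = 1e-5,
              warmup_steps: int = 125, total_steps: int = 1000) -> float:
    """Learning rate at `step` under an assumed linear warmup and linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Print the assumed learning rate at the start, the peak, the midpoint, and the end.
for step in (0, 125, 500, 1000):
    print(step, linear_lr(step))
```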

Training results

Training Loss   Epoch     Step   Validation Loss   Wer
1.3331          2.9412    100    0.4857            39.1828
0.1568          5.8824    200    0.2184            17.1613
0.0243          8.8235    300    0.1941            15.2258
0.0059          11.7647   400    0.1830            13.9355
0.0019          14.7059   500    0.1823            13.8065
0.001           17.6471   600    0.1819            13.3763
0.0008          20.5882   700    0.1823            13.2043
0.0007          23.5294   800    0.1825            13.2043
0.0006          26.4706   900    0.1828            13.2043
0.0006          29.4118   1000   0.1828            13.1183
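
A few quantities can be read off these rows directly. The sketch below derives the number of optimizer steps per epoch from the Epoch and Step columns and locates the checkpoints with the lowest validation loss and the lowest WER. The variable names are hypothetical, and the training-set size estimate is only a rough figure that assumes every step consumed a full batch of 256 examples.

```python
# (training_loss, epoch, step, validation_loss, wer) copied from the table above.
rows = [
    (1.3331, 2.9412, 100, 0.4857, 39.1828),
    (0.1568, 5.8824, 200, 0.2184, 17.1613),
    (0.0243, 8.8235, 300, 0.1941, 15.2258),
    (0.0059, 11.7647, 400, 0.1830, 13.9355),
    (0.0019, 14.7059, 500, 0.1823, 13.8065),
    (0.001, 17.6471, 600, 0.1819, 13.3763),
    (0.0008, 20.5882, 700, 0.1823, 13.2043),
    (0.0007, 23.5294, 800, 0.1825, 13.2043),
    (0.0006, 26.4706, 900, 0.1828, 13.2043),
    (0.0006, 29.4118, 1000, 0.1828, 13.1183),
]

# Optimizer steps per epoch implied by each row (step / epoch, rounded).
steps_per_epoch = [round(step / epoch) for _, epoch, step, _, _ in rows]

# Rough estimate only: assumes each step used a full batch of 256 examples.
approx_examples_per_epoch = steps_per_epoch[0] * 256

# Checkpoints with the lowest validation loss and the lowest WER.
best_by_loss = min(rows, key=lambda r: r[3])
best_by_wer = min(rows, key=lambda r: r[4])

print("steps per epoch:", set(steps_per_epoch))
print("approx. examples per epoch:", approx_examples_per_epoch)
print("lowest validation loss at step", best_by_loss[2])
print("lowest WER at step", best_by_wer[2])
```

In these rows the validation loss is lowest at step 600 and drifts up slightly afterwards, while the WER keeps edging down until step 1000, so the choice of checkpoint depends on which metric is prioritized.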

Framework versions

  • Transformers 4.45.2
  • PyTorch 2.7.0+cu126
  • Datasets 3.6.0
  • Tokenizers 0.20.3

Model size

  • 756M params (Safetensors, tensor type F32)