fr_childes_13

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.8321

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 13
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 40000
  • training_steps: 100000
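The schedule above warms up linearly for 40,000 steps and then decays linearly to zero at step 100,000. A minimal sketch of that schedule in plain Python, mirroring the behaviour of `transformers`' linear scheduler with warmup (constant names here are illustrative, taken from the hyperparameters above):

```python
# Linear warmup + linear decay, as implied by lr_scheduler_type: linear,
# lr_scheduler_warmup_steps: 40000, training_steps: 100000.
BASE_LR = 1e-4
WARMUP_STEPS = 40_000
TOTAL_STEPS = 100_000

def linear_lr(step: int) -> float:
    """Learning rate at a given optimizer step."""
    if step < WARMUP_STEPS:
        # Ramp linearly from 0 up to the base learning rate.
        return BASE_LR * step / WARMUP_STEPS
    # Decay linearly from the base rate down to 0 at the final step.
    return BASE_LR * max(0, TOTAL_STEPS - step) / (TOTAL_STEPS - WARMUP_STEPS)
```

The peak rate of 1e-4 is reached exactly at the end of warmup (step 40,000), halfway through the run, which matches the slow early loss curve in the table below.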

Training results

| Training Loss | Epoch | Step   | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| No log        | 2.5   | 2000   | 7.2410          |
| 7.2222        | 5.0   | 4000   | 5.9683          |
| 7.2222        | 7.5   | 6000   | 5.8241          |
| 5.6899        | 10.0  | 8000   | 5.7266          |
| 5.6899        | 12.5  | 10000  | 5.6300          |
| 5.4663        | 15.0  | 12000  | 5.5524          |
| 5.4663        | 17.5  | 14000  | 5.5122          |
| 5.3077        | 20.0  | 16000  | 5.4600          |
| 5.3077        | 22.5  | 18000  | 5.4015          |
| 5.1915        | 25.0  | 20000  | 5.3367          |
| 5.1915        | 27.5  | 22000  | 5.1496          |
| 4.8173        | 30.0  | 24000  | 4.0808          |
| 4.8173        | 32.5  | 26000  | 3.3695          |
| 3.387         | 35.0  | 28000  | 3.0722          |
| 3.387         | 37.5  | 30000  | 2.8992          |
| 2.8035        | 40.0  | 32000  | 2.7279          |
| 2.8035        | 42.5  | 34000  | 2.6168          |
| 2.531         | 45.0  | 36000  | 2.5570          |
| 2.531         | 47.5  | 38000  | 2.4636          |
| 2.3359        | 50.0  | 40000  | 2.4193          |
| 2.3359        | 52.5  | 42000  | 2.3528          |
| 2.1893        | 55.0  | 44000  | 2.2999          |
| 2.1893        | 57.5  | 46000  | 2.2412          |
| 2.0715        | 60.0  | 48000  | 2.1982          |
| 2.0715        | 62.5  | 50000  | 2.1534          |
| 1.974         | 65.0  | 52000  | 2.1156          |
| 1.974         | 67.5  | 54000  | 2.1153          |
| 1.9016        | 70.0  | 56000  | 2.0625          |
| 1.9016        | 72.5  | 58000  | 2.0274          |
| 1.8403        | 75.0  | 60000  | 2.0366          |
| 1.8403        | 77.5  | 62000  | 1.9956          |
| 1.7872        | 80.0  | 64000  | 1.9750          |
| 1.7872        | 82.5  | 66000  | 1.9521          |
| 1.7439        | 85.0  | 68000  | 1.9579          |
| 1.7439        | 87.5  | 70000  | 1.9302          |
| 1.7084        | 90.0  | 72000  | 1.9384          |
| 1.7084        | 92.5  | 74000  | 1.9067          |
| 1.6752        | 95.0  | 76000  | 1.9239          |
| 1.6752        | 97.5  | 78000  | 1.8918          |
| 1.6482        | 100.0 | 80000  | 1.8803          |
| 1.6482        | 102.5 | 82000  | 1.8641          |
| 1.6209        | 105.0 | 84000  | 1.8514          |
| 1.6209        | 107.5 | 86000  | 1.8467          |
| 1.5966        | 110.0 | 88000  | 1.8564          |
| 1.5966        | 112.5 | 90000  | 1.8653          |
| 1.5815        | 115.0 | 92000  | 1.8481          |
| 1.5815        | 117.5 | 94000  | 1.8392          |
| 1.5695        | 120.0 | 96000  | 1.8283          |
| 1.5695        | 122.5 | 98000  | 1.8264          |
| 1.5583        | 125.0 | 100000 | 1.8321          |
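As a sanity check on the table, evaluation runs every 2,000 steps, and 2,000 steps correspond to 2.5 epochs, so one epoch is 800 optimizer steps; combined with the total train batch size of 32, that implies roughly 25,600 training examples per epoch. The arithmetic (illustrative, not from the original card):

```python
# Derive steps per epoch and implied dataset size from the logging table.
STEPS_PER_EVAL = 2000   # evaluation interval, in optimizer steps
EPOCHS_PER_EVAL = 2.5   # epochs elapsed per evaluation interval
TOTAL_BATCH = 32        # 16 per device x 2 gradient-accumulation steps

steps_per_epoch = STEPS_PER_EVAL / EPOCHS_PER_EVAL      # 800.0
examples_per_epoch = steps_per_epoch * TOTAL_BATCH      # 25600.0
```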

Framework versions

  • Transformers 4.45.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.0.1
  • Tokenizers 0.20.1
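To reproduce the environment, the versions above can be pinned directly; a minimal sketch, assuming a pip-based install (the `+cu124` suffix denotes the CUDA 12.4 PyTorch build; plain `torch==2.5.1` from PyPI installs the default wheel):

```shell
pip install transformers==4.45.2 torch==2.5.1 datasets==3.0.1 tokenizers==0.20.1
```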
Model size

  • 14.9M params
  • Tensor type: F32 (Safetensors)