impossible-llms-english-random

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.0462
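Since the evaluation loss is a cross-entropy (trained with label_smoothing_factor 0.1, per the hyperparameters below), a rough perplexity can be derived from it. Note this is only a sketch: label smoothing inflates the reported loss, so the model's true perplexity is somewhat lower.

```python
import math

# Evaluation loss reported above; label smoothing means this
# over-estimates the model's true perplexity.
eval_loss = 5.0462
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")  # ~ 155.4
```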

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
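The effective batch size listed above follows from the per-device batch size, the device count, and gradient accumulation. A quick sanity check of the arithmetic:

```python
# Effective train batch size = per-device batch * number of GPUs * gradient accumulation
train_batch_size = 12
num_devices = 4
gradient_accumulation_steps = 8

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)  # 384, matching total_train_batch_size above
```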

Training results

Training Loss   Epoch     Step   Validation Loss
35.913          1.0         95   7.1506
30.1513         2.0        190   5.9945
29.2184         3.0        285   5.8395
28.5279         4.0        380   5.6620
27.8359         5.0        475   5.5429
27.5482         6.0        570   5.4532
27.0829         7.0        665   5.3803
26.7397         8.0        760   5.3227
26.4572         9.0        855   5.2749
26.2057         10.0       950   5.2360
25.9724         11.0      1045   5.2010
25.7457         12.0      1140   5.1755
25.7047         13.0      1235   5.1526
25.5117         14.0      1330   5.1328
25.3094         15.0      1425   5.1168
25.0625         16.0      1520   5.1017
24.9048         17.0      1615   5.0899
25.1186         18.0      1710   5.0804
25.0563         19.0      1805   5.0721
24.8198         20.0      1900   5.0669
24.7689         21.0      1995   5.0611
24.8698         22.0      2090   5.0565
24.5199         23.0      2185   5.0543
24.8015         24.0      2280   5.0501
24.4517         25.0      2375   5.0494
24.5355         26.0      2470   5.0486
24.5157         27.0      2565   5.0473
24.6138         28.0      2660   5.0470
24.4382         29.0      2755   5.0465
24.4547         30.0      2850   5.0463
24.4558         31.0      2945   5.0462
39.0136         31.5812   3000   5.0462
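The step/epoch bookkeeping in the table is consistent with the hyperparameters: 95 optimizer steps per epoch and a 3000-step budget give roughly 31.6 epochs, close to the logged final epoch, and a warmup ratio of 0.1 implies 300 warmup steps before cosine decay. A quick check:

```python
training_steps = 3000        # total optimizer steps (from hyperparameters)
steps_per_epoch = 95         # from the table: step 95 completes epoch 1.0
warmup_ratio = 0.1

approx_epochs = training_steps / steps_per_epoch
print(f"{approx_epochs:.2f}")  # ~ 31.58, close to the table's final epoch

warmup_steps = int(training_steps * warmup_ratio)
print(warmup_steps)  # 300 warmup steps before cosine decay
```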

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0
Model size: 126M parameters (Safetensors, F32)
