Al-Atlas-LLM-0.5B-bs-4-lr-5e-05-ep-3-wp-0.1-gacc-32-gnm-1.0-FP16-mx-2048-v2.3

This model is a fine-tuned version of BounharAbdelaziz/Al-Atlas-LLM-0.5B on an unknown dataset. It achieves the following results on the evaluation set (a sketch for recomputing these metrics follows the list):

  • Loss: 0.4813
  • Bleu: 11.5004
  • Chrf: 35.6364
  • Ter: 104.5543
  • Gen Len: 1.0
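
BLEU, chrF, and TER are corpus-level translation metrics. The exact evaluation script is not published with this card, so the snippet below is only a minimal sketch of how such scores are typically computed with the `evaluate` library; the predictions and references are illustrative placeholders:

```python
# Hedged sketch: recomputing BLEU/chrF/TER with the `evaluate` library.
# The evaluation pipeline for this model is not documented, so the
# predictions/references below are illustrative placeholders.
import evaluate

predictions = ["the cat sat on the mat"]          # model outputs (placeholder)
references = [["the cat is sitting on the mat"]]  # gold references (placeholder)

bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
ter = evaluate.load("ter")

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("chrF:", chrf.compute(predictions=predictions, references=references)["score"])
print("TER: ", ter.compute(predictions=predictions, references=references)["score"])
```

Note that a TER above 100, as reported here, is legitimate: it means the hypotheses need more edits than the total reference length.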

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 128
  • optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
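
For reproducibility, these values map onto a Hugging Face `TrainingArguments` configuration roughly as sketched below. Only the listed hyperparameters come from this card; `output_dir`, `max_grad_norm`, and `fp16` are assumptions inferred from the model name (gnm-1.0, FP16):

```python
# Hedged sketch: the hyperparameters above expressed as TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="al-atlas-llm-0.5b-v2.3",  # hypothetical output path
    learning_rate=5e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=32,  # effective train batch size: 4 * 32 = 128
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    max_grad_norm=1.0,  # assumption: "gnm-1.0" in the model name
    fp16=True,          # assumption: "FP16" in the model name
)
```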

Training results

| Training Loss | Epoch  | Step | Validation Loss | Bleu    | Chrf    | Ter      | Gen Len |
|:-------------:|:------:|:----:|:---------------:|:-------:|:-------:|:--------:|:-------:|
| 2.3349        | 0.1156 | 20   | 0.8954          | 6.5229  | 25.1844 | 97.7052  | 1.0     |
| 1.8212        | 0.2313 | 40   | 0.5205          | 8.3863  | 30.7922 | 100.3183 | 1.0     |
| 1.6088        | 0.3469 | 60   | 0.4910          | 9.9375  | 32.6773 | 103.6373 | 1.0     |
| 1.4974        | 0.4626 | 80   | 0.4865          | 10.3046 | 33.5592 | 102.9448 | 1.0     |
| 1.4494        | 0.5782 | 100  | 0.4847          | 10.953  | 34.2959 | 102.3899 | 1.0     |
| 1.3977        | 0.6939 | 120  | 0.4831          | 10.9053 | 34.2735 | 104.1302 | 1.0     |
| 1.3694        | 0.8095 | 140  | 0.4793          | 11.0026 | 34.7101 | 105.4975 | 1.0     |
| 1.3454        | 0.9252 | 160  | 0.4760          | 11.2241 | 34.5451 | 104.612  | 1.0     |
| 1.2959        | 1.0463 | 180  | 0.4830          | 11.133  | 35.0556 | 105.4385 | 1.0     |
| 1.2836        | 1.1619 | 200  | 0.4860          | 11.3134 | 34.6022 | 104.1697 | 1.0     |
| 1.2706        | 1.2776 | 220  | 0.4835          | 10.936  | 34.8636 | 105.827  | 1.0     |
| 1.2087        | 1.3932 | 240  | 0.4832          | 11.114  | 34.9581 | 106.7259 | 1.0     |
| 1.1982        | 1.5089 | 260  | 0.4760          | 11.1626 | 35.0099 | 105.5238 | 1.0     |
| 1.2821        | 1.6245 | 280  | 0.4822          | 11.1749 | 34.9248 | 106.0043 | 1.0     |
| 1.2519        | 1.7402 | 300  | 0.4811          | 11.4891 | 35.3655 | 105.6169 | 1.0     |
| 1.2769        | 1.8558 | 320  | 0.4776          | 11.4816 | 35.2067 | 104.5519 | 1.0     |
| 1.2149        | 1.9714 | 340  | 0.4771          | 11.5422 | 35.434  | 103.5539 | 1.0     |
| 1.2203        | 2.0925 | 360  | 0.4828          | 11.5389 | 35.4972 | 104.2589 | 1.0     |
| 1.1668        | 2.2082 | 380  | 0.4811          | 11.5922 | 35.5947 | 103.9682 | 1.0     |
| 1.1519        | 2.3238 | 400  | 0.4807          | 11.4581 | 35.5341 | 104.5049 | 1.0     |
| 1.1886        | 2.4395 | 420  | 0.4820          | 11.5028 | 35.4251 | 104.3936 | 1.0     |
| 1.1762        | 2.5551 | 440  | 0.4839          | 11.5828 | 35.5995 | 104.6084 | 1.0     |
| 1.1789        | 2.6708 | 460  | 0.4818          | 11.5674 | 35.5663 | 104.4349 | 1.0     |
| 1.1594        | 2.7864 | 480  | 0.4811          | 11.534  | 35.6868 | 104.1138 | 1.0     |
| 1.2573        | 2.9021 | 500  | 0.4813          | 11.5004 | 35.6364 | 104.5543 | 1.0     |

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 2.21.0
  • Tokenizers 0.21.0
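
The card ships no usage example; the following is a minimal generation sketch with Transformers, assuming a standard causal-LM checkpoint. The repo id comes from this card, while the prompt, dtype, and generation settings are illustrative:

```python
# Hedged sketch: loading this checkpoint for text generation. The repo id
# comes from the card; prompt, dtype, and generation settings are examples.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "BounharAbdelaziz/Al-Atlas-LLM-0.5B-bs-4-lr-5e-05-ep-3-wp-0.1-gacc-32-gnm-1.0-FP16-mx-2048-v2.3"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.float16)

prompt = "Hello"  # illustrative; the expected prompt format is undocumented
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```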

Model tree for BounharAbdelaziz/Al-Atlas-LLM-0.5B-bs-4-lr-5e-05-ep-3-wp-0.1-gacc-32-gnm-1.0-FP16-mx-2048-v2.3

  • Base model: Qwen/Qwen2.5-0.5B, fine-tuned as BounharAbdelaziz/Al-Atlas-LLM-0.5B, from which this model was further fine-tuned
  • Quantizations: 1 model