# HGU_rulebook-Llama3.2-Bllossom-5B_fine-tuning-QLoRA-8_16_5
This model is a fine-tuned version of Bllossom/llama-3.2-Korean-Bllossom-AICA-5B on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 5.6999
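A minimal, non-authoritative sketch of loading this QLoRA adapter on top of the base model for inference. The repository ids come from this card; the 4-bit NF4 quantization settings, the causal-LM loading path, and the Korean prompt are illustrative assumptions, not confirmed details of how the model was trained or should be served.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel

base_id = "Bllossom/llama-3.2-Korean-Bllossom-AICA-5B"
adapter_id = "TARARARAK/HGU_rulebook-Llama3.2-Bllossom-5B_fine-tuning-QLoRA-8_16_5"

# 4-bit NF4 quantization, as is typical for QLoRA-style adapters (assumption).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
# Assumes the base checkpoint loads as a causal language model.
model = AutoModelForCausalLM.from_pretrained(
    base_id, quantization_config=bnb_config, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_id)

prompt = "한동대학교 학칙에 대해 알려주세요."  # illustrative prompt (domain assumed from the repo name)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```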
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: adamw_bnb_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 1570
- mixed_precision_training: Native AMP
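As a rough guide, the hyperparameters above map onto `transformers.TrainingArguments` as sketched below. Field names not listed in the card (e.g. `output_dir`) and the choice of `fp16` for "Native AMP" are assumptions.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="HGU_rulebook-QLoRA-8_16_5",  # assumption: output path not stated in the card
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=8,           # 2 x 8 = total train batch size of 16
    seed=42,
    optim="adamw_bnb_8bit",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.1,
    max_steps=1570,
    fp16=True,                               # "Native AMP"; fp16 vs. bf16 is an assumption
)
```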
### Training results
| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 12.6394       | 0.4964 | 78   | 12.3957         |
| 9.2796        | 0.9928 | 156  | 8.8049          |
| 7.1727        | 1.4893 | 234  | 7.0101          |
| 6.2211        | 1.9857 | 312  | 6.1729          |
| 5.9124        | 2.4821 | 390  | 5.9015          |
| 5.8102        | 2.9785 | 468  | 5.8015          |
| 5.7634        | 3.4749 | 546  | 5.7552          |
| 5.7337        | 3.9714 | 624  | 5.7330          |
| 5.7222        | 4.4678 | 702  | 5.7211          |
| 5.7148        | 4.9642 | 780  | 5.7148          |
| 5.7104        | 5.4606 | 858  | 5.7101          |
| 5.7054        | 5.9570 | 936  | 5.7066          |
| 5.7041        | 6.4535 | 1014 | 5.7043          |
| 5.6986        | 6.9499 | 1092 | 5.7024          |
| 5.7033        | 7.4463 | 1170 | 5.7014          |
| 5.7003        | 7.9427 | 1248 | 5.7006          |
| 5.7           | 8.4391 | 1326 | 5.7001          |
| 5.6966        | 8.9356 | 1404 | 5.6999          |
| 5.6971        | 9.4320 | 1482 | 5.6998          |
| 5.6974        | 9.9284 | 1560 | 5.6999          |
### Framework versions
- PEFT 0.12.0
- Transformers 4.46.2
- Pytorch 2.0.1+cu118
- Datasets 3.0.0
- Tokenizers 0.20.1