Best model with eval_loss: 0.6388353109359741
Files changed:
- README.md +82 -0
- adapter_model.safetensors +1 -1
README.md
ADDED
@@ -0,0 +1,82 @@
---
base_model: Bllossom/llama-3.2-Korean-Bllossom-AICA-5B
library_name: peft
license: llama3.2
tags:
- generated_from_trainer
model-index:
- name: HGU_rulebook-Llama3.2-Bllossom-5B_fine-tuning-QLoRA-32_64_3
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# HGU_rulebook-Llama3.2-Bllossom-5B_fine-tuning-QLoRA-32_64_3

This model is a fine-tuned version of [Bllossom/llama-3.2-Korean-Bllossom-AICA-5B](https://huggingface.co/Bllossom/llama-3.2-Korean-Bllossom-AICA-5B) on an unknown dataset.
It achieves the following results on the evaluation set:
- Loss: 5.6952

## Model description

More information needed
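
The repository name suggests a QLoRA setup (the `32_64_3` suffix plausibly encodes the LoRA rank and alpha, though this card does not record it). Below is a minimal sketch of how such an adapter is typically prepared; every value except the base model ID is an assumption, and the Auto class may need to match the base model's actual architecture:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_id = "Bllossom/llama-3.2-Korean-Bllossom-AICA-5B"

# QLoRA: frozen 4-bit base weights; NF4 and fp16 compute are assumptions.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,  # matches the card's "Native AMP"
)

model = AutoModelForCausalLM.from_pretrained(base_id, quantization_config=bnb_config)
model = prepare_model_for_kbit_training(model)

# r=32 / lora_alpha=64 are guesses from the "32_64" repo-name suffix.
lora_config = LoraConfig(r=32, lora_alpha=64, task_type="CAUSAL_LM")
model = get_peft_model(model, lora_config)
tokenizer = AutoTokenizer.from_pretrained(base_id)
```
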
## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a reproduction sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 2
- eval_batch_size: 2
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 16
- optimizer: adamw_bnb_8bit (betas=(0.9, 0.999), epsilon=1e-08, no additional optimizer arguments)
- lr_scheduler_type: cosine_with_restarts
- lr_scheduler_warmup_ratio: 0.1
- training_steps: 942
- mixed_precision_training: Native AMP
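
A minimal `TrainingArguments` sketch mirroring the list above; the output directory, datasets, and Trainer wiring are hypothetical placeholders, while the numeric values are taken from this card:

```python
from transformers import Trainer, TrainingArguments

args = TrainingArguments(
    output_dir="outputs",              # placeholder
    learning_rate=1e-5,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=42,
    gradient_accumulation_steps=8,     # 2 per device x 8 steps = total batch 16
    optim="adamw_bnb_8bit",            # 8-bit AdamW from bitsandbytes
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine_with_restarts",
    warmup_ratio=0.1,
    max_steps=942,
    fp16=True,                         # "Native AMP" mixed precision
)

# trainer = Trainer(model=model, args=args,
#                   train_dataset=train_ds, eval_dataset=eval_ds)  # placeholders
# trainer.train()
```
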
### Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 12.459        | 0.2991 | 47   | 11.8428         |
| 8.9527        | 0.5982 | 94   | 8.3608          |
| 6.8438        | 0.8974 | 141  | 6.6652          |
| 6.1272        | 1.1965 | 188  | 6.0176          |
| 5.858         | 1.4956 | 235  | 5.8316          |
| 5.7707        | 1.7947 | 282  | 5.7618          |
| 5.7435        | 2.0939 | 329  | 5.7331          |
| 5.7215        | 2.3930 | 376  | 5.7194          |
| 5.7118        | 2.6921 | 423  | 5.7114          |
| 5.7047        | 2.9912 | 470  | 5.7063          |
| 5.703         | 3.2904 | 517  | 5.7027          |
| 5.6973        | 3.5895 | 564  | 5.7001          |
| 5.6959        | 3.8886 | 611  | 5.6985         |
| 5.6921        | 4.1877 | 658  | 5.6972          |
| 5.6968        | 4.4869 | 705  | 5.6965          |
| 5.6954        | 4.7860 | 752  | 5.6959          |
| 5.6958        | 5.0851 | 799  | 5.6955          |
| 5.6937        | 5.3842 | 846  | 5.6953          |
| 5.6915        | 5.6834 | 893  | 5.6952          |
| 5.6951        | 5.9825 | 940  | 5.6952          |

### Framework versions

- PEFT 0.12.0
- Transformers 4.46.2
- Pytorch 2.0.1+cu118
- Datasets 3.0.0
- Tokenizers 0.20.1
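
As a usage sketch, the adapter can be loaded on top of the base model with PEFT. The adapter repo ID below is a placeholder (the namespace is not given in this card), and the Auto class may need to match the base model's architecture:

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "Bllossom/llama-3.2-Korean-Bllossom-AICA-5B"
adapter_id = "your-namespace/HGU_rulebook-Llama3.2-Bllossom-5B_fine-tuning-QLoRA-32_64_3"  # placeholder namespace

base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")
model = PeftModel.from_pretrained(base, adapter_id)  # attaches the LoRA adapter
tokenizer = AutoTokenizer.from_pretrained(base_id)
```
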
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:8f7399cc9e6fc3ef753412680e3f47aa1d344c8b1f68503cfcb21db67a91a759
 size 72135352
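
The weights file is tracked with Git LFS, so this diff only swaps the pointer's SHA-256 oid; the file size is unchanged at 72135352 bytes. A quick sketch for checking a downloaded file against the new pointer:

```python
import hashlib

# Compare a local download against the oid in the updated LFS pointer.
with open("adapter_model.safetensors", "rb") as f:
    digest = hashlib.sha256(f.read()).hexdigest()

assert digest == "8f7399cc9e6fc3ef753412680e3f47aa1d344c8b1f68503cfcb21db67a91a759"
```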