prl90777
/

llama_3_3_20250903_2145

+---
+library_name: peft
+license: llama3.2
+base_model: meta-llama/Llama-3.2-3B
+tags:
+- base_model:adapter:meta-llama/Llama-3.2-3B
+- lora
+- transformers
+model-index:
+- name: llama_3_3_20250903_2145
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# llama_3_3_20250903_2145
+This model is a fine-tuned version of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on an unspecified dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.3355
+- Map@3: 0.9371
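
The card does not say how Map@3 is computed. A minimal sketch of one common definition, assuming exactly one correct label per example and a ranked list of predicted labels (the function name `map_at_3` and its arguments are hypothetical, not taken from the training code):

```python
from typing import Sequence


def map_at_3(true_labels: Sequence[int],
             ranked_predictions: Sequence[Sequence[int]]) -> float:
    """Mean average precision at 3, assuming one correct label per example.

    Each example scores 1/rank if the true label appears among the first
    three predictions (rank is 1-based), and 0 otherwise.
    """
    if len(true_labels) != len(ranked_predictions):
        raise ValueError("true_labels and ranked_predictions differ in length")
    if len(true_labels) == 0:
        raise ValueError("at least one example is required")

    total = 0.0
    for truth, preds in zip(true_labels, ranked_predictions):
        # Only the top three predictions count toward the score.
        for rank, pred in enumerate(preds[:3], start=1):
            if pred == truth:
                total += 1.0 / rank
                break
    return total / len(true_labels)
```

Under this definition a score of 0.9371 would mean the correct label is usually ranked first, with most of the remainder in second or third place.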
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0002
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
+- optimizer: OptimizerNames.ADAMW_TORCH_FUSED with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
+- lr_scheduler_type: linear
+- num_epochs: 3
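
The listed values are related by simple arithmetic: 8 examples per batch times 8 accumulation steps gives the total batch size of 64, which is consistent with a single device. The sketch below reproduces that calculation and a linear decay schedule. The helper names, the `num_devices=1` default, and the zero-warmup default are assumptions, since the card lists neither a device count nor warmup steps.

```python
def effective_batch_size(per_device_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device_batch_size * gradient_accumulation_steps * num_devices


def linear_decay_lr(step: int, total_steps: int, base_lr: float,
                    warmup_steps: int = 0) -> float:
    """Learning rate under linear warmup followed by linear decay to zero.

    warmup_steps defaults to 0 because the card lists no warmup (assumption).
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Values from the card: batch size 8, accumulation 8, learning rate 0.0002.
print(effective_batch_size(8, 8))
```

With `total_steps` set to the real number of optimizer steps for the run, `linear_decay_lr` gives the rate at any logged step. The total step count is not stated in the card, so it is left as a parameter here.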
+### Training results
+| Training Loss | Epoch  | Step | Validation Loss | Map@3  |
+|:-------------:|:------:|:----:|:---------------:|:------:|
+| 17.9232       | 0.0523 | 20   | 1.4365          | 0.7168 |
+| 10.0651       | 0.1046 | 40   | 1.1210          | 0.7636 |
+| 9.1342        | 0.1569 | 60   | 1.0630          | 0.7616 |
+| 8.7455        | 0.2092 | 80   | 1.0319          | 0.7732 |
+| 8.0814        | 0.2615 | 100  | 0.9055          | 0.8084 |
+| 7.333         | 0.3138 | 120  | 0.8242          | 0.8219 |
+| 6.8603        | 0.3661 | 140  | 0.8413          | 0.8197 |
+| 6.3616        | 0.4184 | 160  | 0.8386          | 0.8224 |
+| 7.267         | 0.4707 | 180  | 0.8070          | 0.8276 |
+| 5.946         | 0.5230 | 200  | 0.7488          | 0.8428 |
+| 6.3872        | 0.5754 | 220  | 0.7623          | 0.8343 |
+| 5.9969        | 0.6277 | 240  | 0.6821          | 0.8597 |
+| 5.544         | 0.6800 | 260  | 0.6512          | 0.8564 |
+| 4.8356        | 0.7323 | 280  | 0.6462          | 0.8709 |
+| 5.6033        | 0.7846 | 300  | 0.5858          | 0.8815 |
+| 4.4918        | 0.8369 | 320  | 0.5837          | 0.8849 |
+| 4.9479        | 0.8892 | 340  | 0.5603          | 0.8880 |
+| 4.5659        | 0.9415 | 360  | 0.5243          | 0.8932 |
+| 4.3615        | 0.9938 | 380  | 0.5798          | 0.8881 |
+| 4.3143        | 1.0445 | 400  | 0.4902          | 0.8994 |
+| 3.6791        | 1.0968 | 420  | 0.5078          | 0.8991 |
+| 3.5985        | 1.1491 | 440  | 0.4904          | 0.9047 |
+| 3.5077        | 1.2014 | 460  | 0.4797          | 0.9075 |
+| 3.843         | 1.2537 | 480  | 0.4635          | 0.9085 |
+| 3.3767        | 1.3060 | 500  | 0.4548          | 0.9116 |
+| 3.8554        | 1.3583 | 520  | 0.4823          | 0.9043 |
+| 3.8529        | 1.4106 | 540  | 0.4927          | 0.9032 |
+| 3.4666        | 1.4629 | 560  | 0.4424          | 0.9138 |
+| 3.6173        | 1.5152 | 580  | 0.4326          | 0.9160 |
+| 3.3832        | 1.5675 | 600  | 0.4243          | 0.9176 |
+| 2.7451        | 1.6198 | 620  | 0.4521          | 0.9183 |
+| 2.9097        | 1.6721 | 640  | 0.3975          | 0.9219 |
+| 3.2222        | 1.7244 | 660  | 0.3934          | 0.9229 |
+| 3.2087        | 1.7767 | 680  | 0.4234          | 0.9186 |
+| 2.9231        | 1.8290 | 700  | 0.3970          | 0.9211 |
+| 2.7208        | 1.8813 | 720  | 0.3943          | 0.9211 |
+| 2.9979        | 1.9336 | 740  | 0.3821          | 0.9246 |
+| 2.9678        | 1.9859 | 760  | 0.3680          | 0.9301 |
+| 2.501         | 2.0366 | 780  | 0.3765          | 0.9271 |
+| 2.202         | 2.0889 | 800  | 0.3723          | 0.9302 |
+| 1.8267        | 2.1412 | 820  | 0.3923          | 0.9260 |
+| 2.313         | 2.1935 | 840  | 0.3710          | 0.9307 |
+| 2.0693        | 2.2458 | 860  | 0.3658          | 0.9299 |
+| 2.0435        | 2.2981 | 880  | 0.3746          | 0.9307 |
+| 1.9854        | 2.3504 | 900  | 0.4199          | 0.9277 |
+| 2.0134        | 2.4027 | 920  | 0.3675          | 0.9324 |
+| 1.7272        | 2.4551 | 940  | 0.3662          | 0.9314 |
+| 1.8824        | 2.5074 | 960  | 0.3755          | 0.9309 |
+| 1.8695        | 2.5597 | 980  | 0.3588          | 0.9340 |
+| 1.9778        | 2.6120 | 1000 | 0.3511          | 0.9356 |
+| 1.8434        | 2.6643 | 1020 | 0.3617          | 0.9341 |
+| 1.7754        | 2.7166 | 1040 | 0.3491          | 0.9350 |
+| 1.9125        | 2.7689 | 1060 | 0.3446          | 0.9350 |
+| 1.728         | 2.8212 | 1080 | 0.3439          | 0.9367 |
+| 1.9307        | 2.8735 | 1100 | 0.3379          | 0.9364 |
+| 1.828         | 2.9258 | 1120 | 0.3362          | 0.9373 |
+| 1.4855        | 2.9781 | 1140 | 0.3355          | 0.9371 |
+### Framework versions
+- PEFT 0.17.1
+- Transformers 4.56.0
+- Pytorch 2.8.0+cu126
+- Datasets 4.0.0
+- Tokenizers 0.22.0

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:379f6896778969e87c9132359bc95f2e8f4de68070fe90562e811cbee68aae69
+oid sha256:f030c3ac1f22164488518b9b482bf7ed4e7c9fafd623f36a2da3965d64cee005
 size 98106360
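
In a Git LFS pointer, the `oid` is the SHA-256 digest of the real file and `size` is its length in bytes. Here only the `oid` changes, so the adapter weights differ while the file size stays the same. A minimal sketch for checking a downloaded file against a pointer follows. The function names are hypothetical, and the path in the usage note is whatever local copy of `adapter_model.safetensors` you have.

```python
import hashlib
from pathlib import Path


def parse_lfs_pointer(pointer_text: str) -> dict:
    """Parse the key/value lines of a Git LFS pointer file."""
    fields = {}
    for line in pointer_text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algorithm, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algorithm": algorithm,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path: str, pointer: dict) -> bool:
    """Return True if the file at `path` has the pointer's size and digest."""
    if pointer["algorithm"] != "sha256":
        raise ValueError(f"unsupported hash algorithm: {pointer['algorithm']}")
    file_path = Path(path)
    # Cheap check first: a size mismatch rules the file out without hashing.
    if file_path.stat().st_size != pointer["size"]:
        return False
    hasher = hashlib.sha256()
    with file_path.open("rb") as handle:
        # Read in 1 MiB chunks so large weight files are not loaded whole.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]
```

For example, passing the new pointer text to `parse_lfs_pointer` and a local `adapter_model.safetensors` to `matches_pointer` would indicate whether the local file corresponds to this commit or to the previous one.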