prl90777 committed
Commit 7313d68 · verified · 1 Parent(s): c23418d

Model save

Files changed (2):
  1. README.md +120 -0
  2. adapter_model.safetensors +1 -1
README.md ADDED
@@ -0,0 +1,120 @@
---
library_name: peft
license: llama3.2
base_model: meta-llama/Llama-3.2-3B
tags:
- base_model:adapter:meta-llama/Llama-3.2-3B
- lora
- transformers
model-index:
- name: llama_3_3_20250903_2145
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# llama_3_3_20250903_2145

This model is a LoRA fine-tune of [meta-llama/Llama-3.2-3B](https://huggingface.co/meta-llama/Llama-3.2-3B) on an unspecified dataset.
It achieves the following results on the evaluation set:
- Loss: 0.3355
- Map@3: 0.9371
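
The card does not define Map@3. Assuming the common single-label formulation (mean average precision at cutoff 3: the reciprocal rank of the first correct candidate among the top 3, averaged over examples), a minimal sketch:

```python
def map_at_3(predictions, labels):
    """MAP@3 for single-label tasks (assumed formulation): `predictions`
    holds up to 3 ranked candidates per example (best first), `labels`
    the single correct answer for each example."""
    total = 0.0
    for preds, label in zip(predictions, labels):
        for rank, pred in enumerate(preds[:3], start=1):
            if pred == label:
                total += 1.0 / rank  # credit decays with rank
                break
    return total / len(labels)

# e.g. map_at_3([["a", "b", "c"], ["x", "y", "z"]], ["b", "z"]) == (1/2 + 1/3) / 2
```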

## Model description

More information needed
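
What the metadata does confirm: the repository ships a LoRA adapter (`adapter_model.safetensors`, ~98 MB) for the PEFT library, trained on top of `meta-llama/Llama-3.2-3B`. Loading should follow the standard PEFT pattern; a minimal sketch, assuming the adapter is published under the hypothetical repo id `prl90777/llama_3_3_20250903_2145`:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B")

# Attach the LoRA adapter weights on top of the frozen base model.
# The repo id below is a hypothetical placeholder.
model = PeftModel.from_pretrained(base, "prl90777/llama_3_3_20250903_2145")
```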

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a rough `TrainingArguments` sketch follows the list):
- learning_rate: 0.0002
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 8
- total_train_batch_size: 64
- optimizer: adamw_torch_fused with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- num_epochs: 3
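
These values map onto Hugging Face `TrainingArguments` roughly as below. This is a sketch, not the author's actual script: `output_dir` is a placeholder, and the eval cadence is inferred from the results table (one evaluation every 20 steps).

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="llama_3_3_20250903_2145",  # placeholder
    learning_rate=2e-4,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=8,  # 8 x 8 = effective batch of 64 (single device)
    num_train_epochs=3,
    lr_scheduler_type="linear",
    optim="adamw_torch_fused",      # AdamW defaults: betas=(0.9, 0.999), eps=1e-8
    seed=42,
    eval_strategy="steps",
    eval_steps=20,                  # inferred from the results table
)
```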

### Training results

| Training Loss | Epoch | Step | Validation Loss | Map@3 |
|:-------------:|:------:|:----:|:---------------:|:------:|
| 17.9232 | 0.0523 | 20 | 1.4365 | 0.7168 |
| 10.0651 | 0.1046 | 40 | 1.1210 | 0.7636 |
| 9.1342 | 0.1569 | 60 | 1.0630 | 0.7616 |
| 8.7455 | 0.2092 | 80 | 1.0319 | 0.7732 |
| 8.0814 | 0.2615 | 100 | 0.9055 | 0.8084 |
| 7.333 | 0.3138 | 120 | 0.8242 | 0.8219 |
| 6.8603 | 0.3661 | 140 | 0.8413 | 0.8197 |
| 6.3616 | 0.4184 | 160 | 0.8386 | 0.8224 |
| 7.267 | 0.4707 | 180 | 0.8070 | 0.8276 |
| 5.946 | 0.5230 | 200 | 0.7488 | 0.8428 |
| 6.3872 | 0.5754 | 220 | 0.7623 | 0.8343 |
| 5.9969 | 0.6277 | 240 | 0.6821 | 0.8597 |
| 5.544 | 0.6800 | 260 | 0.6512 | 0.8564 |
| 4.8356 | 0.7323 | 280 | 0.6462 | 0.8709 |
| 5.6033 | 0.7846 | 300 | 0.5858 | 0.8815 |
| 4.4918 | 0.8369 | 320 | 0.5837 | 0.8849 |
| 4.9479 | 0.8892 | 340 | 0.5603 | 0.8880 |
| 4.5659 | 0.9415 | 360 | 0.5243 | 0.8932 |
| 4.3615 | 0.9938 | 380 | 0.5798 | 0.8881 |
| 4.3143 | 1.0445 | 400 | 0.4902 | 0.8994 |
| 3.6791 | 1.0968 | 420 | 0.5078 | 0.8991 |
| 3.5985 | 1.1491 | 440 | 0.4904 | 0.9047 |
| 3.5077 | 1.2014 | 460 | 0.4797 | 0.9075 |
| 3.843 | 1.2537 | 480 | 0.4635 | 0.9085 |
| 3.3767 | 1.3060 | 500 | 0.4548 | 0.9116 |
| 3.8554 | 1.3583 | 520 | 0.4823 | 0.9043 |
| 3.8529 | 1.4106 | 540 | 0.4927 | 0.9032 |
| 3.4666 | 1.4629 | 560 | 0.4424 | 0.9138 |
| 3.6173 | 1.5152 | 580 | 0.4326 | 0.9160 |
| 3.3832 | 1.5675 | 600 | 0.4243 | 0.9176 |
| 2.7451 | 1.6198 | 620 | 0.4521 | 0.9183 |
| 2.9097 | 1.6721 | 640 | 0.3975 | 0.9219 |
| 3.2222 | 1.7244 | 660 | 0.3934 | 0.9229 |
| 3.2087 | 1.7767 | 680 | 0.4234 | 0.9186 |
| 2.9231 | 1.8290 | 700 | 0.3970 | 0.9211 |
| 2.7208 | 1.8813 | 720 | 0.3943 | 0.9211 |
| 2.9979 | 1.9336 | 740 | 0.3821 | 0.9246 |
| 2.9678 | 1.9859 | 760 | 0.3680 | 0.9301 |
| 2.501 | 2.0366 | 780 | 0.3765 | 0.9271 |
| 2.202 | 2.0889 | 800 | 0.3723 | 0.9302 |
| 1.8267 | 2.1412 | 820 | 0.3923 | 0.9260 |
| 2.313 | 2.1935 | 840 | 0.3710 | 0.9307 |
| 2.0693 | 2.2458 | 860 | 0.3658 | 0.9299 |
| 2.0435 | 2.2981 | 880 | 0.3746 | 0.9307 |
| 1.9854 | 2.3504 | 900 | 0.4199 | 0.9277 |
| 2.0134 | 2.4027 | 920 | 0.3675 | 0.9324 |
| 1.7272 | 2.4551 | 940 | 0.3662 | 0.9314 |
| 1.8824 | 2.5074 | 960 | 0.3755 | 0.9309 |
| 1.8695 | 2.5597 | 980 | 0.3588 | 0.9340 |
| 1.9778 | 2.6120 | 1000 | 0.3511 | 0.9356 |
| 1.8434 | 2.6643 | 1020 | 0.3617 | 0.9341 |
| 1.7754 | 2.7166 | 1040 | 0.3491 | 0.9350 |
| 1.9125 | 2.7689 | 1060 | 0.3446 | 0.9350 |
| 1.728 | 2.8212 | 1080 | 0.3439 | 0.9367 |
| 1.9307 | 2.8735 | 1100 | 0.3379 | 0.9364 |
| 1.828 | 2.9258 | 1120 | 0.3362 | 0.9373 |
| 1.4855 | 2.9781 | 1140 | 0.3355 | 0.9371 |

### Framework versions

- PEFT 0.17.1
- Transformers 4.56.0
- Pytorch 2.8.0+cu126
- Datasets 4.0.0
- Tokenizers 0.22.0
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:379f6896778969e87c9132359bc95f2e8f4de68070fe90562e811cbee68aae69
+ oid sha256:f030c3ac1f22164488518b9b482bf7ed4e7c9fafd623f36a2da3965d64cee005
  size 98106360