This model is a fine-tuned version of amazingvince/zephyr-220m-sft-full. The training dataset is not recorded (the auto-generated card lists it as "None"). It achieves the following results on the evaluation set:
More information needed
The following hyperparameters were used during training:
Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
---|---|---|---|---|---|---|---|---|---|---|---|
0.6906 | 0.03 | 100 | 0.6932 | 0.0008 | 0.0007 | 0.4860 | 0.0002 | -437.9984 | -549.3683 | -4.0893 | -4.0515 |
0.6844 | 0.05 | 200 | 0.6855 | 0.0323 | 0.0173 | 0.5640 | 0.0150 | -437.8319 | -549.0540 | -4.0871 | -4.0501 |
0.6685 | 0.08 | 300 | 0.6675 | 0.1075 | 0.0537 | 0.6160 | 0.0538 | -437.4682 | -548.3016 | -4.0788 | -4.0432 |
0.6579 | 0.1 | 400 | 0.6426 | 0.2153 | 0.0941 | 0.6430 | 0.1212 | -437.0637 | -547.2234 | -4.0645 | -4.0309 |
0.6331 | 0.13 | 500 | 0.6241 | 0.2980 | 0.1106 | 0.6430 | 0.1874 | -436.8989 | -546.3970 | -4.0525 | -4.0221 |
0.6229 | 0.15 | 600 | 0.6138 | 0.3428 | 0.1103 | 0.6580 | 0.2325 | -436.9023 | -545.9487 | -4.0402 | -4.0116 |
0.6008 | 0.18 | 700 | 0.6053 | 0.3822 | 0.0970 | 0.6560 | 0.2852 | -437.0354 | -545.5550 | -4.0301 | -4.0042 |
0.5751 | 0.21 | 800 | 0.5998 | 0.4077 | 0.0879 | 0.6540 | 0.3198 | -437.1260 | -545.2994 | -4.0359 | -4.0099 |
0.6485 | 0.23 | 900 | 0.5922 | 0.4208 | 0.0655 | 0.6600 | 0.3553 | -437.3501 | -545.1683 | -4.0167 | -3.9936 |
0.6164 | 0.26 | 1000 | 0.5880 | 0.4046 | 0.0287 | 0.6620 | 0.3759 | -437.7182 | -545.3309 | -4.0092 | -3.9869 |
0.6225 | 0.28 | 1100 | 0.5852 | 0.4058 | 0.0110 | 0.6680 | 0.3948 | -437.8951 | -545.3189 | -4.0240 | -3.9984 |
0.6289 | 0.31 | 1200 | 0.5824 | 0.4127 | 0.0078 | 0.6670 | 0.4048 | -437.9265 | -545.2498 | -4.0253 | -3.9994 |
0.5818 | 0.34 | 1300 | 0.5818 | 0.4222 | 0.0097 | 0.6680 | 0.4125 | -437.9080 | -545.1544 | -4.0212 | -3.9953 |
0.567 | 0.36 | 1400 | 0.5797 | 0.4098 | -0.0141 | 0.6730 | 0.4238 | -438.1456 | -545.2791 | -4.0333 | -4.0062 |
0.5659 | 0.39 | 1500 | 0.5790 | 0.4204 | -0.0154 | 0.6780 | 0.4358 | -438.1591 | -545.1725 | -4.0245 | -3.9963 |
0.5993 | 0.41 | 1600 | 0.5783 | 0.4161 | -0.0285 | 0.6720 | 0.4446 | -438.2904 | -545.2161 | -4.0185 | -3.9907 |
0.5999 | 0.44 | 1700 | 0.5767 | 0.4067 | -0.0468 | 0.6840 | 0.4535 | -438.4729 | -545.3095 | -4.0207 | -3.9935 |
0.6004 | 0.46 | 1800 | 0.5731 | 0.4233 | -0.0394 | 0.6830 | 0.4627 | -438.3991 | -545.1437 | -4.0219 | -3.9944 |
0.5349 | 0.49 | 1900 | 0.5720 | 0.4285 | -0.0429 | 0.6830 | 0.4714 | -438.4335 | -545.0914 | -4.0295 | -4.0012 |
0.5377 | 0.52 | 2000 | 0.5702 | 0.4255 | -0.0540 | 0.6850 | 0.4795 | -438.5449 | -545.1220 | -4.0290 | -4.0009 |
0.4988 | 0.54 | 2100 | 0.5713 | 0.4347 | -0.0548 | 0.6840 | 0.4895 | -438.5533 | -545.0299 | -4.0317 | -4.0039 |
0.6093 | 0.57 | 2200 | 0.5706 | 0.4464 | -0.0456 | 0.6810 | 0.4920 | -438.4607 | -544.9128 | -4.0288 | -4.0014 |
0.5356 | 0.59 | 2300 | 0.5689 | 0.4484 | -0.0486 | 0.6880 | 0.4971 | -438.4912 | -544.8922 | -4.0257 | -3.9986 |
0.5753 | 0.62 | 2400 | 0.5681 | 0.4596 | -0.0441 | 0.6850 | 0.5037 | -438.4457 | -544.7802 | -4.0100 | -3.9846 |
0.5709 | 0.65 | 2500 | 0.5673 | 0.4693 | -0.0387 | 0.6910 | 0.5081 | -438.3924 | -544.6835 | -4.0100 | -3.9849 |
0.5565 | 0.67 | 2600 | 0.5665 | 0.4692 | -0.0401 | 0.6820 | 0.5092 | -438.4054 | -544.6850 | -4.0096 | -3.9843 |
0.585 | 0.7 | 2700 | 0.5650 | 0.4780 | -0.0351 | 0.6940 | 0.5131 | -438.3558 | -544.5962 | -4.0074 | -3.9820 |
0.5883 | 0.72 | 2800 | 0.5670 | 0.4914 | -0.0151 | 0.6880 | 0.5066 | -438.1562 | -544.4624 | -3.9894 | -3.9669 |
0.624 | 0.75 | 2900 | 0.5663 | 0.4877 | -0.0191 | 0.6840 | 0.5068 | -438.1958 | -544.4997 | -3.9935 | -3.9705 |
0.5347 | 0.77 | 3000 | 0.5644 | 0.4757 | -0.0335 | 0.6850 | 0.5092 | -438.3401 | -544.6199 | -4.0019 | -3.9777 |
0.5837 | 0.8 | 3100 | 0.5637 | 0.4783 | -0.0302 | 0.6830 | 0.5085 | -438.3073 | -544.5936 | -3.9976 | -3.9742 |
0.5293 | 0.83 | 3200 | 0.5634 | 0.4715 | -0.0363 | 0.6890 | 0.5078 | -438.3679 | -544.6616 | -4.0023 | -3.9778 |
0.5128 | 0.85 | 3300 | 0.5620 | 0.4745 | -0.0387 | 0.6880 | 0.5131 | -438.3917 | -544.6319 | -4.0053 | -3.9804 |
0.6204 | 0.88 | 3400 | 0.5625 | 0.4679 | -0.0442 | 0.6860 | 0.5121 | -438.4469 | -544.6978 | -4.0067 | -3.9815 |
0.5469 | 0.9 | 3500 | 0.5618 | 0.4612 | -0.0491 | 0.6860 | 0.5102 | -438.4956 | -544.7651 | -4.0098 | -3.9843 |
0.5807 | 0.93 | 3600 | 0.5615 | 0.4675 | -0.0454 | 0.6890 | 0.5129 | -438.4584 | -544.7015 | -4.0068 | -3.9818 |
0.5265 | 0.96 | 3700 | 0.5620 | 0.4675 | -0.0435 | 0.6880 | 0.5110 | -438.4403 | -544.7019 | -4.0082 | -3.9833 |
0.5484 | 0.98 | 3800 | 0.5615 | 0.4685 | -0.0449 | 0.6930 | 0.5133 | -438.4536 | -544.6919 | -4.0103 | -3.9851 |
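
As a reading aid for the reward columns, here is a minimal sketch that checks two relationships in the logged numbers. It assumes (the card does not say so) that the run used a DPO-style preference trainer, whose logs typically use these column names, so that Rewards/margins is Rewards/chosen minus Rewards/rejected and the loss is a sigmoid preference loss. The helper names `margin_gap` and `sigmoid_pref_loss` are hypothetical, and the rows are copied from the table above.

```python
import math

# (Rewards/chosen, Rewards/rejected, Rewards/margins) copied from the table, keyed by step.
rows = {
    200: (0.0323, 0.0173, 0.0150),
    1000: (0.4046, 0.0287, 0.3759),
    2000: (0.4255, -0.0540, 0.4795),
    3800: (0.4685, -0.0449, 0.5133),
}


def margin_gap(chosen: float, rejected: float, margin: float) -> float:
    """Absolute difference between (chosen - rejected) and the logged margin."""
    return abs((chosen - rejected) - margin)


def sigmoid_pref_loss(margin: float) -> float:
    """Assumed DPO-style loss for one pair: -log(sigmoid(margin))."""
    return math.log1p(math.exp(-margin))


# Largest mismatch across the sampled rows; expected to be rounding noise only.
max_gap = max(margin_gap(*values) for values in rows.values())

if __name__ == "__main__":
    print(f"largest |chosen - rejected - margin| = {max_gap:.4f}")
    print(f"loss at zero margin = {sigmoid_pref_loss(0.0):.4f}")
```

Under that assumption, a zero margin gives a loss of ln 2 (about 0.6931), which is close to the first logged validation loss of 0.6932, when Rewards/accuracies is still near 0.5. The logged loss is an average over pairs, so it will not equal `sigmoid_pref_loss` applied to the average margin at later steps.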
Weights & Biases run: https://wandb.ai/amazingvince/huggingface/runs/z71h0hc3?workspace=user-amazingvince