Terjman-Supreme-v2.2
This model is a fine-tuned version of facebook/nllb-200-3.3B; the fine-tuning dataset is not specified. It achieves the following results on the evaluation set at the final logged checkpoint (step 16500):
- Loss: 2.2058
- Bleu: 23.4245
- Chrf: 44.5703
- Ter: 78.1656
- Gen Len: 12.3071
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0005
- train_batch_size: 1
- eval_batch_size: 1
- seed: 42
- gradient_accumulation_steps: 64
- total_train_batch_size: 64
- optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
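The hyperparameters above imply two derived quantities: the effective batch size and the shape of the learning-rate curve. The sketch below is illustrative only and is not the training script. It assumes a single device (the card does not state the device count), rounds the warmup length with `round` as a simplification, and uses hypothetical helper names `total_train_batch_size` and `linear_lr`.

```python
# Values copied from the hyperparameter list above.
LEARNING_RATE = 0.0005
TRAIN_BATCH_SIZE = 1
GRADIENT_ACCUMULATION_STEPS = 64
WARMUP_RATIO = 0.1

# ASSUMPTION: one device; the card does not say how many were used.
NUM_DEVICES = 1


def total_train_batch_size(per_device, accumulation_steps, num_devices=NUM_DEVICES):
    """Examples consumed per optimizer update."""
    return per_device * accumulation_steps * num_devices


def linear_lr(step, total_steps, peak_lr=LEARNING_RATE, warmup_ratio=WARMUP_RATIO):
    """Linear warmup from 0 to peak_lr, then linear decay back to 0.

    ASSUMPTION: the warmup length is rounded to the nearest step; the
    exact rounding used during training is not stated in the card.
    """
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print(total_train_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    # total_steps=1000 is a made-up value chosen only to show the curve shape.
    for step in (0, 50, 100, 550, 1000):
        print(step, linear_lr(step, total_steps=1000))
```

With a per-device batch of 1, every optimizer update accumulates gradients over 64 forward/backward passes, which is how the listed total_train_batch_size of 64 arises under the single-device assumption.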
Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len |
---|---|---|---|---|---|---|---|
137.8105 | 0.0181 | 100 | 3.2305 | 5.2001 | 13.6145 | 118.7357 | 12.5635 |
97.2318 | 0.0361 | 200 | 2.5726 | 17.2165 | 35.9826 | 88.2951 | 12.4082 |
83.4952 | 0.0542 | 300 | 2.4521 | 17.3135 | 36.8477 | 87.0818 | 12.2729 |
75.0776 | 0.0723 | 400 | 2.3968 | 18.058 | 37.798 | 87.0603 | 12.4929 |
66.9757 | 0.0904 | 500 | 2.3446 | 20.1995 | 38.8098 | 84.8913 | 12.3459 |
63.6814 | 0.1084 | 600 | 2.2928 | 19.8012 | 38.2663 | 84.9302 | 12.3835 |
60.3928 | 0.1265 | 700 | 2.2272 | 19.4166 | 38.2539 | 84.0673 | 12.3035 |
56.197 | 0.1446 | 800 | 2.2304 | 20.3862 | 39.7791 | 86.3549 | 12.9435 |
55.4461 | 0.1627 | 900 | 2.2529 | 19.6843 | 38.5102 | 84.7924 | 12.4235 |
54.5164 | 0.1807 | 1000 | 2.1330 | 21.2121 | 40.5161 | 81.4403 | 12.2235 |
53.228 | 0.1988 | 1100 | 2.1817 | 20.4483 | 40.5389 | 82.9359 | 12.5094 |
51.4801 | 0.2169 | 1200 | 2.2008 | 21.6519 | 40.7718 | 82.0789 | 12.3435 |
51.5955 | 0.2350 | 1300 | 2.1723 | 19.8541 | 40.5091 | 83.7385 | 12.5271 |
51.072 | 0.2530 | 1400 | 2.2120 | 20.0968 | 39.0978 | 95.2042 | 12.7388 |
51.3383 | 0.2711 | 1500 | 2.2069 | 21.9455 | 41.6071 | 113.2387 | 13.4894 |
51.1281 | 0.2892 | 1600 | 2.2140 | 19.6691 | 38.7329 | 107.2516 | 12.2718 |
50.6391 | 0.3072 | 1700 | 2.1903 | 20.9862 | 40.6458 | 87.1593 | 12.6706 |
50.8716 | 0.3253 | 1800 | 2.1774 | 20.9512 | 40.8041 | 83.8517 | 12.5753 |
49.1451 | 0.3434 | 1900 | 2.1281 | 21.7404 | 41.1935 | 81.2223 | 12.3718 |
48.6002 | 0.3615 | 2000 | 2.1656 | 21.3734 | 40.6663 | 81.527 | 12.2447 |
46.7725 | 0.3795 | 2100 | 2.1560 | 21.3878 | 40.2352 | 82.5294 | 12.4776 |
46.1724 | 0.3976 | 2200 | 2.1248 | 21.6288 | 40.8181 | 82.571 | 12.6271 |
45.3777 | 0.4157 | 2300 | 2.1343 | 21.8915 | 41.1995 | 82.9489 | 12.5776 |
44.7859 | 0.4338 | 2400 | 2.0903 | 22.322 | 41.9163 | 80.283 | 12.3718 |
43.8804 | 0.4518 | 2500 | 2.1051 | 22.3484 | 41.6646 | 80.6631 | 12.4106 |
42.9552 | 0.4699 | 2600 | 2.1127 | 21.3378 | 40.9099 | 81.9978 | 12.4812 |
42.8838 | 0.4880 | 2700 | 2.0602 | 20.8244 | 41.1047 | 82.7353 | 12.44 |
41.5548 | 0.5061 | 2800 | 2.0623 | 22.2229 | 41.5223 | 87.224 | 12.6024 |
41.6697 | 0.5241 | 2900 | 2.0682 | 23.1842 | 42.8083 | 81.7717 | 12.4788 |
41.5065 | 0.5422 | 3000 | 2.0393 | 22.3557 | 41.6164 | 79.3085 | 12.3718 |
41.6211 | 0.5603 | 3100 | 2.0054 | 22.2312 | 41.8033 | 80.2008 | 12.2647 |
41.248 | 0.5783 | 3200 | 2.0231 | 20.8397 | 41.33 | 95.135 | 13.7929 |
39.1609 | 0.5964 | 3300 | 2.0002 | 22.7687 | 43.2316 | 81.6566 | 12.4506 |
38.9398 | 0.6145 | 3400 | 2.0289 | 21.7436 | 41.1019 | 82.7019 | 12.4176 |
38.3047 | 0.6326 | 3500 | 2.0109 | 22.8019 | 41.9596 | 94.4557 | 12.8129 |
38.268 | 0.6506 | 3600 | 2.0201 | 24.4653 | 44.0864 | 80.5792 | 12.6435 |
38.5439 | 0.6687 | 3700 | 2.0134 | 24.1665 | 43.2041 | 78.6761 | 12.3188 |
37.7287 | 0.6868 | 3800 | 2.0260 | 24.3768 | 43.6489 | 81.3409 | 12.7047 |
36.1784 | 0.7049 | 3900 | 1.9660 | 23.9311 | 43.3869 | 79.2676 | 12.4035 |
36.6123 | 0.7229 | 4000 | 1.9625 | 22.976 | 43.4062 | 79.7419 | 12.3976 |
36.7558 | 0.7410 | 4100 | 1.9621 | 22.9323 | 43.1629 | 80.1544 | 12.3635 |
36.6426 | 0.7591 | 4200 | 1.9682 | 23.0305 | 43.0949 | 79.4987 | 12.32 |
36.3618 | 0.7772 | 4300 | 1.9667 | 22.5987 | 42.8116 | 79.1677 | 12.2047 |
36.0427 | 0.7952 | 4400 | 1.9333 | 23.7835 | 42.8368 | 78.3063 | 12.2541 |
35.1977 | 0.8133 | 4500 | 1.9455 | 24.1571 | 43.0898 | 78.0521 | 12.24 |
34.9866 | 0.8314 | 4600 | 1.9282 | 23.5239 | 42.3893 | 78.3871 | 12.1388 |
34.66 | 0.8494 | 4700 | 1.9404 | 23.9171 | 43.2115 | 79.4609 | 12.3435 |
34.85 | 0.8675 | 4800 | 1.9518 | 23.698 | 42.9408 | 79.1596 | 12.2612 |
34.9682 | 0.8856 | 4900 | 1.8821 | 23.4893 | 43.8764 | 78.5991 | 12.3012 |
33.784 | 0.9037 | 5000 | 1.8985 | 23.6473 | 43.8604 | 78.1807 | 12.2612 |
33.8055 | 0.9217 | 5100 | 1.9105 | 23.6402 | 43.5989 | 78.5758 | 12.3553 |
33.9263 | 0.9398 | 5200 | 1.8903 | 24.5462 | 44.4274 | 78.4294 | 12.4588 |
34.1902 | 0.9579 | 5300 | 1.8965 | 23.8897 | 43.9914 | 78.6182 | 12.4647 |
33.1319 | 0.9760 | 5400 | 1.8967 | 23.4588 | 43.4492 | 77.2763 | 12.3094 |
32.6819 | 0.9940 | 5500 | 1.8980 | 23.1381 | 43.7181 | 79.4987 | 12.5306 |
25.6872 | 1.0121 | 5600 | 1.9989 | 24.3884 | 43.8866 | 76.2215 | 12.2565 |
26.2497 | 1.0302 | 5700 | 2.0097 | 23.0962 | 42.8724 | 78.4643 | 12.2953 |
25.0397 | 1.0483 | 5800 | 2.0168 | 23.3407 | 43.1597 | 78.6203 | 12.2941 |
26.1732 | 1.0663 | 5900 | 2.0115 | 23.4883 | 43.5452 | 79.9678 | 12.6753 |
25.9725 | 1.0844 | 6000 | 2.0090 | 23.1359 | 43.4672 | 79.7718 | 12.6659 |
25.6574 | 1.1025 | 6100 | 2.0034 | 22.8505 | 43.1392 | 80.5169 | 12.3859 |
25.5012 | 1.1205 | 6200 | 2.0140 | 23.7912 | 43.4571 | 77.5566 | 12.2776 |
25.3003 | 1.1386 | 6300 | 2.0215 | 23.9161 | 43.6086 | 78.0778 | 12.34 |
24.6921 | 1.1567 | 6400 | 1.9851 | 23.8938 | 44.0363 | 79.4653 | 12.4859 |
25.0079 | 1.1748 | 6500 | 1.9878 | 23.8998 | 43.4735 | 78.9155 | 12.3506 |
25.0747 | 1.1928 | 6600 | 1.9882 | 22.9832 | 43.0514 | 79.9574 | 12.36 |
25.2562 | 1.2109 | 6700 | 2.0077 | 22.6288 | 43.0805 | 80.337 | 12.3835 |
25.1187 | 1.2290 | 6800 | 1.9651 | 23.4281 | 43.7022 | 78.7561 | 12.3518 |
24.0937 | 1.2471 | 6900 | 1.9807 | 23.6022 | 43.5876 | 79.3789 | 12.2953 |
24.9507 | 1.2651 | 7000 | 1.9347 | 23.6003 | 43.4728 | 78.0933 | 12.2659 |
24.5099 | 1.2832 | 7100 | 1.9686 | 23.7821 | 43.7423 | 78.1923 | 12.3659 |
24.3604 | 1.3013 | 7200 | 2.0012 | 23.5141 | 43.7252 | 78.5037 | 12.3541 |
24.2341 | 1.3194 | 7300 | 1.9883 | 23.6762 | 43.7825 | 77.999 | 12.3082 |
23.8904 | 1.3374 | 7400 | 1.9842 | 23.9927 | 43.6138 | 77.48 | 12.3094 |
24.7891 | 1.3555 | 7500 | 1.9782 | 23.4309 | 43.695 | 78.4042 | 12.2541 |
24.3329 | 1.3736 | 7600 | 1.9914 | 23.3102 | 43.8064 | 79.1631 | 12.3871 |
24.4164 | 1.3917 | 7700 | 1.9766 | 23.5369 | 43.4989 | 77.9728 | 12.2565 |
24.1632 | 1.4097 | 7800 | 1.9629 | 23.5813 | 43.9428 | 78.335 | 12.2624 |
24.3018 | 1.4278 | 7900 | 1.9478 | 24.2417 | 44.5599 | 77.6036 | 12.2635 |
23.5833 | 1.4459 | 8000 | 1.9401 | 24.0363 | 44.3674 | 77.6119 | 12.2388 |
24.0873 | 1.4639 | 8100 | 1.9250 | 24.0754 | 44.316 | 77.1101 | 12.2 |
23.8089 | 1.4820 | 8200 | 1.9356 | 24.4306 | 45.0766 | 77.1962 | 12.3188 |
23.6771 | 1.5001 | 8300 | 1.9320 | 23.9639 | 44.5074 | 77.3474 | 12.2435 |
23.9428 | 1.5182 | 8400 | 1.9510 | 23.3543 | 44.2183 | 78.931 | 12.3071 |
24.1568 | 1.5362 | 8500 | 1.9518 | 23.6985 | 44.5651 | 78.9739 | 12.3647 |
23.7392 | 1.5543 | 8600 | 1.9338 | 24.3444 | 45.0775 | 77.3414 | 12.3259 |
23.8515 | 1.5724 | 8700 | 1.9474 | 24.1262 | 44.792 | 77.9217 | 12.3071 |
23.0664 | 1.5905 | 8800 | 1.9473 | 24.0164 | 44.7487 | 78.0097 | 12.3224 |
23.3824 | 1.6085 | 8900 | 1.9340 | 23.806 | 44.6919 | 77.9177 | 12.2953 |
22.7727 | 1.6266 | 9000 | 1.9716 | 23.4272 | 44.7997 | 78.3186 | 12.3212 |
23.4696 | 1.6447 | 9100 | 1.9493 | 23.7553 | 44.5676 | 78.0819 | 12.3424 |
22.7403 | 1.6628 | 9200 | 1.9668 | 24.0244 | 44.7161 | 77.4703 | 12.3459 |
22.8108 | 1.6808 | 9300 | 1.9469 | 24.3704 | 45.1849 | 77.0614 | 12.2741 |
23.8398 | 1.6989 | 9400 | 1.9339 | 23.8243 | 44.5131 | 77.6411 | 12.2659 |
22.6805 | 1.7170 | 9500 | 1.9319 | 24.1688 | 44.8033 | 77.717 | 12.2765 |
22.7274 | 1.7350 | 9600 | 1.9525 | 23.845 | 44.87 | 77.9104 | 12.2859 |
22.9588 | 1.7531 | 9700 | 1.9431 | 24.1488 | 44.7974 | 78.0564 | 12.2918 |
22.9954 | 1.7712 | 9800 | 1.9482 | 23.8619 | 45.008 | 77.9592 | 12.3271 |
22.6917 | 1.7893 | 9900 | 1.9385 | 24.1254 | 45.0475 | 77.1275 | 12.3012 |
22.7402 | 1.8073 | 10000 | 1.9290 | 24.552 | 45.4205 | 76.0642 | 12.2541 |
23.087 | 1.8254 | 10100 | 1.9144 | 24.2509 | 45.0886 | 77.0239 | 12.3071 |
23.0166 | 1.8435 | 10200 | 1.9263 | 24.2323 | 45.0173 | 77.1054 | 12.3071 |
22.3922 | 1.8616 | 10300 | 1.9365 | 23.3072 | 44.4322 | 78.3998 | 12.3024 |
22.9795 | 1.8796 | 10400 | 1.9352 | 23.2202 | 44.4076 | 79.067 | 12.3212 |
22.3474 | 1.8977 | 10500 | 1.9440 | 22.9263 | 44.2298 | 79.3766 | 12.3412 |
22.1394 | 1.9158 | 10600 | 1.9303 | 22.9458 | 44.2676 | 78.7353 | 12.3153 |
22.9308 | 1.9339 | 10700 | 1.9329 | 23.2577 | 44.4509 | 78.2277 | 12.3329 |
22.2057 | 1.9519 | 10800 | 1.9297 | 23.0115 | 44.0851 | 79.0073 | 12.3247 |
22.8086 | 1.9700 | 10900 | 1.9351 | 23.083 | 44.244 | 79.5701 | 12.3565 |
22.4432 | 1.9881 | 11000 | 1.9423 | 23.3255 | 44.3864 | 78.756 | 12.3188 |
19.0883 | 2.0061 | 11100 | 2.0378 | 23.0833 | 44.3251 | 78.859 | 12.3118 |
17.6663 | 2.0242 | 11200 | 2.1177 | 23.1699 | 44.2894 | 78.8729 | 12.2965 |
18.1139 | 2.0423 | 11300 | 2.1363 | 22.8201 | 44.1824 | 79.4578 | 12.3035 |
17.3899 | 2.0604 | 11400 | 2.1522 | 23.1964 | 44.4594 | 78.4838 | 12.3282 |
17.4895 | 2.0784 | 11500 | 2.1518 | 23.0312 | 44.4584 | 78.8593 | 12.2659 |
17.5904 | 2.0965 | 11600 | 2.1763 | 22.7354 | 44.1317 | 78.8398 | 12.2941 |
17.2585 | 2.1146 | 11700 | 2.1681 | 22.9018 | 44.1018 | 84.3444 | 12.5188 |
17.7048 | 2.1327 | 11800 | 2.1767 | 22.9439 | 44.2815 | 84.3079 | 12.5212 |
17.3691 | 2.1507 | 11900 | 2.1812 | 22.8058 | 44.0874 | 84.2541 | 12.5176 |
17.4433 | 2.1688 | 12000 | 2.1851 | 22.7803 | 44.1106 | 84.8505 | 12.5318 |
17.4181 | 2.1869 | 12100 | 2.1836 | 22.9985 | 44.115 | 84.5472 | 12.5353 |
17.7475 | 2.2050 | 12200 | 2.1929 | 23.1362 | 44.2473 | 84.1141 | 12.4976 |
17.3206 | 2.2230 | 12300 | 2.1860 | 23.1621 | 44.3407 | 78.4548 | 12.2988 |
17.5009 | 2.2411 | 12400 | 2.1877 | 22.9871 | 44.3105 | 84.2672 | 12.5388 |
17.3298 | 2.2592 | 12500 | 2.1906 | 23.0868 | 44.4643 | 84.3887 | 12.5494 |
17.3273 | 2.2772 | 12600 | 2.1933 | 23.0713 | 44.5811 | 78.539 | 12.3224 |
17.7982 | 2.2953 | 12700 | 2.2007 | 23.0106 | 44.3926 | 84.531 | 12.5471 |
17.4851 | 2.3134 | 12800 | 2.1865 | 23.1507 | 44.5556 | 78.5757 | 12.3259 |
17.526 | 2.3315 | 12900 | 2.1850 | 23.1178 | 44.548 | 78.4484 | 12.2941 |
17.3512 | 2.3495 | 13000 | 2.2085 | 23.345 | 44.7801 | 78.1779 | 12.3129 |
16.9256 | 2.3676 | 13100 | 2.2143 | 23.5069 | 44.7383 | 78.0014 | 12.3094 |
17.5127 | 2.3857 | 13200 | 2.2020 | 23.5399 | 44.7461 | 77.9207 | 12.3035 |
16.9278 | 2.4038 | 13300 | 2.2075 | 23.3908 | 44.6205 | 78.2937 | 12.3365 |
16.9606 | 2.4218 | 13400 | 2.2032 | 23.4848 | 44.7578 | 77.8764 | 12.3271 |
17.1626 | 2.4399 | 13500 | 2.2008 | 23.2227 | 44.2244 | 78.361 | 12.3282 |
17.0724 | 2.4580 | 13600 | 2.2089 | 23.4288 | 44.5322 | 78.1828 | 12.2976 |
17.3127 | 2.4761 | 13700 | 2.1986 | 23.5571 | 44.7397 | 77.867 | 12.3071 |
17.5657 | 2.4941 | 13800 | 2.2048 | 23.6261 | 44.9271 | 78.1384 | 12.3118 |
17.2787 | 2.5122 | 13900 | 2.2031 | 23.3188 | 44.5084 | 78.1988 | 12.3047 |
17.4854 | 2.5303 | 14000 | 2.2063 | 23.2519 | 44.546 | 78.171 | 12.3094 |
17.2063 | 2.5483 | 14100 | 2.2056 | 23.1215 | 44.5185 | 78.1912 | 12.3035 |
17.3853 | 2.5664 | 14200 | 2.2023 | 23.307 | 44.7458 | 78.0718 | 12.2929 |
17.774 | 2.5845 | 14300 | 2.1994 | 23.3031 | 44.8097 | 77.9967 | 12.3059 |
16.6834 | 2.6026 | 14400 | 2.2036 | 23.4314 | 44.7324 | 77.9656 | 12.3082 |
17.1441 | 2.6206 | 14500 | 2.2020 | 23.3211 | 44.8156 | 77.9575 | 12.3141 |
17.6746 | 2.6387 | 14600 | 2.2064 | 23.2103 | 44.6337 | 78.2396 | 12.3294 |
17.0056 | 2.6568 | 14700 | 2.2077 | 23.2063 | 44.6551 | 77.9941 | 12.2918 |
17.1054 | 2.6749 | 14800 | 2.2078 | 23.4498 | 44.7412 | 78.0114 | 12.2976 |
17.5818 | 2.6929 | 14900 | 2.2031 | 23.3876 | 44.7344 | 77.9263 | 12.2894 |
17.2745 | 2.7110 | 15000 | 2.2031 | 23.2746 | 44.6287 | 78.1371 | 12.2929 |
17.2902 | 2.7291 | 15100 | 2.2045 | 23.3318 | 44.731 | 78.1528 | 12.3024 |
17.3091 | 2.7472 | 15200 | 2.2080 | 23.339 | 44.6857 | 78.186 | 12.3059 |
17.405 | 2.7652 | 15300 | 2.2060 | 23.3269 | 44.6893 | 78.1277 | 12.2906 |
17.5894 | 2.7833 | 15400 | 2.2071 | 23.2891 | 44.7165 | 78.1082 | 12.2965 |
17.0158 | 2.8014 | 15500 | 2.2038 | 23.4179 | 44.7075 | 78.0389 | 12.2847 |
17.4972 | 2.8194 | 15600 | 2.2068 | 23.3515 | 44.7074 | 78.1332 | 12.2941 |
17.2833 | 2.8375 | 15700 | 2.2061 | 23.3688 | 44.7259 | 78.0318 | 12.3 |
17.0573 | 2.8556 | 15800 | 2.2068 | 23.4313 | 44.7199 | 77.9653 | 12.2976 |
17.4166 | 2.8737 | 15900 | 2.2056 | 23.4953 | 44.6781 | 78.0423 | 12.3106 |
17.0984 | 2.8917 | 16000 | 2.2057 | 23.3163 | 44.6874 | 78.0286 | 12.2906 |
17.5987 | 2.9098 | 16100 | 2.2047 | 23.3148 | 44.6769 | 78.0154 | 12.3 |
17.2003 | 2.9279 | 16200 | 2.2053 | 23.3496 | 44.6994 | 77.9406 | 12.2988 |
17.1631 | 2.9460 | 16300 | 2.2049 | 23.2588 | 44.619 | 77.9645 | 12.2941 |
17.3428 | 2.9640 | 16400 | 2.2063 | 23.2298 | 44.5834 | 78.0989 | 12.2965 |
17.3013 | 2.9821 | 16500 | 2.2058 | 23.4245 | 44.5703 | 78.1656 | 12.3071 |
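
The log above can be read programmatically to compare checkpoints. The sketch below parses a handful of rows copied from the table (not the full log) and picks the row with the lowest validation loss, the highest Bleu, and the lowest Ter among them. The column keys and the `parse_rows` helper are hypothetical names introduced for illustration, and the steps-per-epoch figure is only an estimate derived from the final row.

```python
# A few rows copied verbatim from the table above; this is NOT the full log.
ROWS = """
34.9682 | 0.8856 | 4900 | 1.8821 | 23.4893 | 43.8764 | 78.5991 | 12.3012 |
33.9263 | 0.9398 | 5200 | 1.8903 | 24.5462 | 44.4274 | 78.4294 | 12.4588 |
22.7402 | 1.8073 | 10000 | 1.9290 | 24.552 | 45.4205 | 76.0642 | 12.2541 |
17.3013 | 2.9821 | 16500 | 2.2058 | 23.4245 | 44.5703 | 78.1656 | 12.3071 |
"""

# Hypothetical short keys for the eight table columns, in order.
COLUMNS = ["train_loss", "epoch", "step", "val_loss", "bleu", "chrf", "ter", "gen_len"]


def parse_rows(text):
    """Turn pipe-delimited rows into a list of dicts keyed by COLUMNS."""
    records = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|") if cell.strip()]
        record = dict(zip(COLUMNS, (float(cell) for cell in cells)))
        record["step"] = int(record["step"])
        records.append(record)
    return records


records = parse_rows(ROWS)

# Lower is better for validation loss and Ter; higher is better for Bleu.
lowest_val_loss = min(records, key=lambda r: r["val_loss"])
highest_bleu = max(records, key=lambda r: r["bleu"])
lowest_ter = min(records, key=lambda r: r["ter"])

# Rough estimate of optimizer steps per epoch from the final logged row.
final = records[-1]
steps_per_epoch = final["step"] / final["epoch"]

if __name__ == "__main__":
    print("lowest validation loss at step", lowest_val_loss["step"])
    print("highest Bleu at step", highest_bleu["step"])
    print("lowest Ter at step", lowest_ter["step"])
    print("estimated steps per epoch:", round(steps_per_epoch))
```

Among these sampled rows, the lowest validation loss and the best Bleu and Ter fall on different steps, and none of them is the final step, so which checkpoint counts as "best" depends on the metric chosen.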
Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
Model tree for BounharAbdelaziz/Terjman-Supreme-v2.2
Base model
facebook/nllb-200-3.3B