# Terjman-Ultra-v2.2
This model is a fine-tuned version of [facebook/nllb-200-1.3B](https://huggingface.co/facebook/nllb-200-1.3B) on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 2.0034
- Bleu: 24.9877
- Chrf: 44.6972
- Ter: 77.1984
- Gen Len: 12.3388
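
The evaluation script itself is not published. As a minimal sketch, the reported metrics (BLEU, chrF, TER) can be computed with the `evaluate` library; the predictions and references below are placeholders, not the actual evaluation set:

```python
import evaluate

# Placeholder data: the actual evaluation set is not published.
predictions = ["the model's output sentence"]
references = [["the reference translation"]]

# BLEU, chrF, and TER as reported above (sacrebleu-based implementations).
bleu = evaluate.load("sacrebleu")
chrf = evaluate.load("chrf")
ter = evaluate.load("ter")

print("BLEU:", bleu.compute(predictions=predictions, references=references)["score"])
print("chrF:", chrf.compute(predictions=predictions, references=references)["score"])
print("TER: ", ter.compute(predictions=predictions, references=references)["score"])
```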
## Model description
Terjman-Ultra-v2.2 is a machine-translation checkpoint obtained by fine-tuning facebook/nllb-200-1.3B, a multilingual translation model; beyond that, more information is needed.
## Intended uses & limitations
More information needed
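
The model is not deployed on a hosted inference endpoint, so it has to be run locally. A minimal inference sketch follows; note that the card does not state the language pair, so the `eng_Latn` → `ary_Arab` direction (English → Moroccan Arabic, in NLLB/FLORES-200 codes) is an assumption based on the Terjman series, not a documented property of this checkpoint:

```python
from transformers import pipeline

# Assumption: English -> Moroccan Arabic (eng_Latn -> ary_Arab), following the
# Terjman series; the card itself does not document the language pair.
translator = pipeline(
    "translation",
    model="BounharAbdelaziz/Terjman-Ultra-v2.2",
    src_lang="eng_Latn",
    tgt_lang="ary_Arab",
)

result = translator("Hello, how are you today?", max_length=64)
print(result[0]["translation_text"])
```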
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a sketch mapping them onto `Seq2SeqTrainingArguments` follows the list):
- learning_rate: 0.0005
- train_batch_size: 4
- eval_batch_size: 4
- seed: 42
- gradient_accumulation_steps: 32
- total_train_batch_size: 128 (train_batch_size × gradient_accumulation_steps = 4 × 32)
- optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 3
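
The full training script is not published. Under that caveat, here is a minimal sketch of how the hyperparameters above would map onto `Seq2SeqTrainingArguments`; `output_dir`, `predict_with_generate`, and anything not listed above are assumptions:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="terjman-ultra-v2.2",   # hypothetical path, not from the card
    learning_rate=5e-4,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    gradient_accumulation_steps=32,    # effective train batch: 4 x 32 = 128
    seed=42,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3,
    predict_with_generate=True,        # assumption: needed to report BLEU/chrF/TER
)
```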
### Training results
Training Loss | Epoch | Step | Validation Loss | Bleu | Chrf | Ter | Gen Len |
---|---|---|---|---|---|---|---|
70.7656 | 0.0361 | 100 | 2.9806 | 12.9152 | 30.9913 | 98.5067 | 12.7812 |
52.8738 | 0.0723 | 200 | 2.7797 | 15.5994 | 34.6105 | 93.0509 | 12.5988 |
45.6954 | 0.1084 | 300 | 2.6691 | 17.7178 | 36.6882 | 88.7668 | 12.5506 |
39.9565 | 0.1446 | 400 | 2.5662 | 17.5642 | 37.6628 | 90.0519 | 12.7071 |
36.7684 | 0.1807 | 500 | 2.4275 | 19.4366 | 39.1214 | 86.0378 | 12.5859 |
33.4308 | 0.2169 | 600 | 2.3610 | 20.5894 | 40.0986 | 85.4547 | 12.5965 |
31.3199 | 0.2530 | 700 | 2.2796 | 20.063 | 39.6827 | 85.9359 | 12.7247 |
30.5014 | 0.2892 | 800 | 2.2517 | 20.6695 | 40.3914 | 85.2728 | 12.7459 |
29.2522 | 0.3253 | 900 | 2.2143 | 21.2474 | 41.3276 | 82.497 | 12.5588 |
27.6444 | 0.3615 | 1000 | 2.1680 | 21.6217 | 41.2203 | 84.6141 | 12.8859 |
26.1025 | 0.3976 | 1100 | 2.1871 | 21.9405 | 41.8135 | 83.7077 | 12.8235 |
25.8736 | 0.4338 | 1200 | 2.1521 | 23.0621 | 42.5788 | 82.0175 | 12.7929 |
24.5748 | 0.4699 | 1300 | 2.1207 | 22.9369 | 42.4532 | 80.0937 | 12.3553 |
24.0533 | 0.5061 | 1400 | 2.1252 | 22.9731 | 42.9912 | 80.8733 | 12.4365 |
23.8092 | 0.5422 | 1500 | 2.1173 | 22.1319 | 42.3421 | 81.1727 | 12.3988 |
23.2533 | 0.5783 | 1600 | 2.1203 | 22.9435 | 42.7683 | 80.9051 | 12.4294 |
22.5494 | 0.6145 | 1700 | 2.1041 | 23.5743 | 43.1741 | 78.8658 | 12.44 |
22.4968 | 0.6506 | 1800 | 2.1143 | 24.0091 | 43.7091 | 79.2722 | 12.4471 |
21.8837 | 0.6868 | 1900 | 2.1086 | 23.5233 | 43.1374 | 83.0314 | 12.5106 |
21.4213 | 0.7229 | 2000 | 2.0802 | 23.4877 | 43.2273 | 80.5064 | 12.4506 |
21.0951 | 0.7591 | 2100 | 2.0604 | 23.8738 | 43.2852 | 83.3723 | 12.5212 |
20.861 | 0.7952 | 2200 | 2.0520 | 23.8624 | 43.4149 | 79.3725 | 12.46 |
20.164 | 0.8314 | 2300 | 2.0542 | 24.3772 | 43.6011 | 77.7732 | 12.4059 |
20.4746 | 0.8675 | 2400 | 2.0430 | 23.9582 | 43.1838 | 84.7154 | 12.5318 |
19.9062 | 0.9037 | 2500 | 2.0450 | 23.6104 | 43.4165 | 79.6449 | 12.4435 |
19.6233 | 0.9398 | 2600 | 2.0134 | 24.3974 | 44.2913 | 78.2905 | 12.3906 |
19.2435 | 0.9760 | 2700 | 2.0239 | 24.2622 | 44.0087 | 77.9154 | 12.3341 |
17.9055 | 1.0119 | 2800 | 2.0400 | 23.8797 | 43.6364 | 78.3719 | 12.3059 |
17.3272 | 1.0481 | 2900 | 2.0592 | 24.2694 | 43.7129 | 77.6987 | 12.3494 |
17.3509 | 1.0842 | 3000 | 2.0404 | 24.202 | 43.8023 | 78.4414 | 12.3882 |
17.4569 | 1.1204 | 3100 | 2.0216 | 24.254 | 43.937 | 77.67 | 12.4412 |
17.1146 | 1.1565 | 3200 | 2.0148 | 24.2023 | 44.4105 | 78.7747 | 12.4565 |
16.6774 | 1.1927 | 3300 | 2.0354 | 23.7494 | 43.796 | 79.1691 | 12.3871 |
17.0017 | 1.2288 | 3400 | 2.0189 | 24.2081 | 44.1283 | 77.7256 | 12.4047 |
16.6622 | 1.2650 | 3500 | 2.0140 | 24.217 | 44.2066 | 77.9338 | 12.38 |
16.5509 | 1.3011 | 3600 | 2.0312 | 24.3622 | 44.2009 | 78.341 | 12.4424 |
16.4062 | 1.3372 | 3700 | 2.0121 | 24.3472 | 44.0205 | 77.7659 | 12.4012 |
16.5797 | 1.3734 | 3800 | 2.0434 | 23.9545 | 43.8269 | 78.3792 | 12.4188 |
16.3888 | 1.4095 | 3900 | 2.0162 | 23.7386 | 43.79 | 78.9535 | 12.4576 |
16.0585 | 1.4457 | 4000 | 2.0166 | 23.9605 | 43.8616 | 79.1566 | 12.42 |
16.4091 | 1.4818 | 4100 | 2.0234 | 24.2887 | 44.2176 | 78.4537 | 12.4176 |
16.2554 | 1.5180 | 4200 | 2.0220 | 24.278 | 44.3497 | 77.9841 | 12.4553 |
16.1241 | 1.5541 | 4300 | 1.9927 | 24.9649 | 44.8357 | 77.048 | 12.3647 |
16.0068 | 1.5903 | 4400 | 2.0072 | 25.0727 | 44.8539 | 77.268 | 12.3729 |
15.7749 | 1.6264 | 4500 | 2.0135 | 24.4552 | 44.5047 | 78.7697 | 12.4294 |
15.9322 | 1.6626 | 4600 | 2.0011 | 24.7077 | 44.3159 | 78.0441 | 12.4153 |
15.8415 | 1.6987 | 4700 | 2.0028 | 24.2381 | 44.0753 | 78.4017 | 12.4247 |
15.544 | 1.7349 | 4800 | 2.0001 | 24.6574 | 44.4939 | 77.7002 | 12.4094 |
15.8296 | 1.7710 | 4900 | 1.9900 | 25.0563 | 45.2428 | 77.0852 | 12.3988 |
15.4856 | 1.8072 | 5000 | 1.9845 | 24.8863 | 44.87 | 77.323 | 12.3518 |
15.3165 | 1.8433 | 5100 | 1.9749 | 24.5669 | 44.8923 | 77.8361 | 12.3906 |
15.8505 | 1.8795 | 5200 | 1.9879 | 24.6842 | 44.8395 | 78.055 | 12.4165 |
15.2058 | 1.9156 | 5300 | 1.9728 | 24.8448 | 44.7121 | 77.4636 | 12.3529 |
15.1863 | 1.9517 | 5400 | 1.9831 | 25.0569 | 44.8317 | 77.37 | 12.3659 |
15.3051 | 1.9879 | 5500 | 1.9772 | 25.0335 | 45.0737 | 77.0836 | 12.3494 |
14.3992 | 2.0239 | 5600 | 2.0031 | 25.2894 | 45.2749 | 76.6897 | 12.3541 |
14.2126 | 2.0600 | 5700 | 1.9986 | 25.1707 | 45.1943 | 77.2161 | 12.3741 |
14.5549 | 2.0962 | 5800 | 2.0035 | 25.2928 | 45.129 | 77.1665 | 12.3576 |
14.4486 | 2.1323 | 5900 | 1.9989 | 25.0251 | 44.9919 | 77.573 | 12.3459 |
14.3505 | 2.1684 | 6000 | 1.9978 | 25.1946 | 44.9452 | 77.4272 | 12.3365 |
14.5783 | 2.2046 | 6100 | 2.0078 | 25.0654 | 44.8266 | 77.6317 | 12.3765 |
14.2261 | 2.2407 | 6200 | 2.0079 | 25.0622 | 44.788 | 77.5574 | 12.3729 |
14.3163 | 2.2769 | 6300 | 2.0104 | 25.0457 | 44.7239 | 78.0145 | 12.4 |
14.4612 | 2.3130 | 6400 | 2.0035 | 25.184 | 45.1284 | 77.4162 | 12.3741 |
14.1588 | 2.3492 | 6500 | 2.0060 | 24.9345 | 44.7316 | 77.8871 | 12.3824 |
14.3309 | 2.3853 | 6600 | 2.0102 | 25.2199 | 45.0809 | 77.634 | 12.3859 |
14.0311 | 2.4215 | 6700 | 2.0017 | 25.1946 | 44.8508 | 77.3804 | 12.3565 |
14.0974 | 2.4576 | 6800 | 2.0046 | 25.005 | 44.902 | 78.0266 | 12.3882 |
14.2057 | 2.4938 | 6900 | 2.0035 | 24.9742 | 45.0221 | 78.0698 | 12.3894 |
14.4113 | 2.5299 | 7000 | 2.0027 | 25.2538 | 45.0121 | 76.9214 | 12.3329 |
14.4146 | 2.5661 | 7100 | 2.0040 | 25.0706 | 44.7691 | 77.1347 | 12.3565 |
13.8906 | 2.6022 | 7200 | 2.0076 | 25.0349 | 44.6927 | 77.2741 | 12.3788 |
14.5977 | 2.6384 | 7300 | 2.0051 | 25.0629 | 44.7696 | 77.1818 | 12.3776 |
14.2155 | 2.6745 | 7400 | 2.0059 | 25.0825 | 44.8729 | 77.0082 | 12.36 |
14.0901 | 2.7106 | 7500 | 2.0040 | 25.125 | 44.8754 | 76.8288 | 12.3588 |
14.1026 | 2.7468 | 7600 | 2.0045 | 24.9539 | 44.6773 | 77.1001 | 12.3365 |
14.5714 | 2.7829 | 7700 | 2.0049 | 25.0891 | 44.7547 | 76.9704 | 12.3494 |
14.0298 | 2.8191 | 7800 | 2.0036 | 25.1397 | 44.8099 | 76.9669 | 12.3565 |
14.1156 | 2.8552 | 7900 | 2.0033 | 24.9858 | 44.6656 | 77.0585 | 12.3365 |
14.032 | 2.8914 | 8000 | 2.0036 | 25.0805 | 44.6327 | 77.1592 | 12.3471 |
14.0703 | 2.9275 | 8100 | 2.0033 | 24.9824 | 44.5966 | 77.2585 | 12.3471 |
14.157 | 2.9637 | 8200 | 2.0034 | 24.9877 | 44.6972 | 77.1984 | 12.3388 |
### Framework versions
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.1.0
- Tokenizers 0.21.0
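
As a quick sanity check, the pinned versions above can be compared against the local environment before loading the model (a convenience sketch, not part of the original card):

```python
import transformers, torch, datasets, tokenizers

# Compare the local environment against the versions this model was trained with.
expected = {
    "transformers": "4.47.1",
    "torch": "2.5.1",       # original build: 2.5.1+cu124
    "datasets": "3.1.0",
    "tokenizers": "0.21.0",
}
found = {
    "transformers": transformers.__version__,
    "torch": torch.__version__,
    "datasets": datasets.__version__,
    "tokenizers": tokenizers.__version__,
}
for name, want in expected.items():
    print(f"{name}: expected {want}, found {found[name]}")
```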