impossible-llms-dutch-natural

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.8511
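
For readers who prefer perplexity, the reported loss can be converted with a single exponential. This is a minimal sketch that assumes the value is a mean per-token cross-entropy in nats. The card does not say so explicitly, and because label_smoothing_factor is 0.1, the reported loss may include a smoothing term, so the result is a rough guide only.

```python
import math

# Final evaluation loss reported above.
eval_loss = 6.8511

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"approximate perplexity: {perplexity:.1f}")
```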

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
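
The derived values in the list above follow from the per-device settings, and the sketch below reproduces that arithmetic. The `lr_at` helper is hypothetical: it assumes the cosine scheduler uses a linear warmup followed by a single half-cosine decay to zero, which is a common shape for this scheduler type but is not confirmed by the card.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 12            # per device
eval_batch_size = 8              # per device
num_devices = 4
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Number of optimizer steps spent in warmup.
warmup_steps = math.ceil(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size, warmup_steps)
for step in (0, 150, 300, 1650, 3000):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at step 300 and decays to zero by step 3000.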

Training results

Columns: Training Loss, Epoch, Step, Validation Loss
83.7117 0.9333 7 10.2143
75.8934 1.9333 14 9.4048
72.9415 2.9333 21 9.0186
70.9719 3.9333 28 8.8230
69.8879 4.9333 35 8.6578
68.3539 5.9333 42 8.4637
67.0209 6.9333 49 8.2657
64.7796 7.9333 56 8.0714
63.6467 8.9333 63 7.8540
61.6167 9.9333 70 7.6384
60.047 10.9333 77 7.4229
58.3381 11.9333 84 7.2187
56.4005 12.9333 91 7.0145
54.9344 13.9333 98 6.8051
53.2455 14.9333 105 6.6022
51.8781 15.9333 112 6.4209
49.9611 16.9333 119 6.2543
48.9698 17.9333 126 6.1229
48.3432 18.9333 133 5.9939
47.4903 19.9333 140 5.9090
46.4999 20.9333 147 5.8312
46.0457 21.9333 154 5.7889
45.7769 22.9333 161 5.7287
45.4704 23.9333 168 5.6897
45.3076 24.9333 175 5.6573
44.8951 25.9333 182 5.6173
44.4198 26.9333 189 5.5830
44.2616 27.9333 196 5.5655
44.2027 28.9333 203 5.5388
43.7259 29.9333 210 5.5077
43.4983 30.9333 217 5.4934
43.4004 31.9333 224 5.4657
43.3279 32.9333 231 5.4449
43.1174 33.9333 238 5.4298
43.0299 34.9333 245 5.4237
42.802 35.9333 252 5.3982
42.6775 36.9333 259 5.3848
42.5224 37.9333 266 5.3695
42.5592 38.9333 273 5.3576
41.9492 39.9333 280 5.3499
41.7719 40.9333 287 5.3304
41.6114 41.9333 294 5.3186
41.7619 42.9333 301 5.3047
41.4263 43.9333 308 5.2879
41.2317 44.9333 315 5.2730
41.1587 45.9333 322 5.2632
40.8852 46.9333 329 5.2574
40.7323 47.9333 336 5.2376
40.5239 48.9333 343 5.2260
40.5716 49.9333 350 5.2140
40.0775 50.9333 357 5.1992
39.855 51.9333 364 5.1946
40.0483 52.9333 371 5.1892
40.159 53.9333 378 5.1764
39.6859 54.9333 385 5.1607
39.6553 55.9333 392 5.1519
39.5134 56.9333 399 5.1354
38.9637 57.9333 406 5.1230
39.3161 58.9333 413 5.1158
38.6704 59.9333 420 5.1091
38.7079 60.9333 427 5.1005
38.6312 61.9333 434 5.0862
38.4287 62.9333 441 5.0821
37.8619 63.9333 448 5.0755
38.1086 64.9333 455 5.0707
37.8793 65.9333 462 5.0616
37.6437 66.9333 469 5.0473
37.49 67.9333 476 5.0440
37.1884 68.9333 483 5.0390
37.1471 69.9333 490 5.0462
37.0321 70.9333 497 5.0332
36.6673 71.9333 504 5.0255
36.7063 72.9333 511 5.0264
36.5293 73.9333 518 5.0250
36.3218 74.9333 525 5.0115
36.3916 75.9333 532 5.0157
36.1176 76.9333 539 5.0202
36.0567 77.9333 546 5.0154
35.833 78.9333 553 5.0120
35.5794 79.9333 560 5.0102
35.4828 80.9333 567 5.0079
35.3006 81.9333 574 5.0116
35.1394 82.9333 581 5.0242
35.0031 83.9333 588 5.0175
35.0788 84.9333 595 5.0178
34.7668 85.9333 602 5.0223
34.5778 86.9333 609 5.0189
34.5011 87.9333 616 5.0303
34.2244 88.9333 623 5.0307
34.0421 89.9333 630 5.0430
34.1727 90.9333 637 5.0479
34.0275 91.9333 644 5.0366
33.7746 92.9333 651 5.0507
33.532 93.9333 658 5.0585
33.2973 94.9333 665 5.0688
33.1983 95.9333 672 5.0686
32.939 96.9333 679 5.0730
32.677 97.9333 686 5.0768
32.7291 98.9333 693 5.0896
32.5378 99.9333 700 5.0950
32.4795 100.9333 707 5.1006
31.9008 101.9333 714 5.1056
32.4762 102.9333 721 5.1182
31.9937 103.9333 728 5.1289
31.728 104.9333 735 5.1354
31.534 105.9333 742 5.1430
31.5226 106.9333 749 5.1590
31.4869 107.9333 756 5.1636
31.3191 108.9333 763 5.1758
30.9727 109.9333 770 5.1863
31.2655 110.9333 777 5.1950
30.6855 111.9333 784 5.2014
30.4679 112.9333 791 5.2081
30.6404 113.9333 798 5.2228
30.3814 114.9333 805 5.2198
30.235 115.9333 812 5.2372
30.0568 116.9333 819 5.2530
29.997 117.9333 826 5.2508
29.8703 118.9333 833 5.2798
29.6953 119.9333 840 5.2838
29.5049 120.9333 847 5.2843
29.3829 121.9333 854 5.3037
29.1048 122.9333 861 5.3074
29.2621 123.9333 868 5.3259
28.8765 124.9333 875 5.3395
28.721 125.9333 882 5.3465
28.8328 126.9333 889 5.3552
28.3896 127.9333 896 5.3636
28.4245 128.9333 903 5.3890
28.4029 129.9333 910 5.3924
27.8876 130.9333 917 5.4017
27.8653 131.9333 924 5.4139
27.7764 132.9333 931 5.4197
27.7357 133.9333 938 5.4274
27.4753 134.9333 945 5.4368
27.4625 135.9333 952 5.4565
27.2527 136.9333 959 5.4666
27.1109 137.9333 966 5.4814
27.0542 138.9333 973 5.4973
26.8748 139.9333 980 5.4976
26.7482 140.9333 987 5.5145
26.676 141.9333 994 5.5362
26.6701 142.9333 1001 5.5322
26.3176 143.9333 1008 5.5522
26.3026 144.9333 1015 5.5741
26.0128 145.9333 1022 5.5719
26.0076 146.9333 1029 5.5842
25.9651 147.9333 1036 5.5956
25.742 148.9333 1043 5.6074
25.8191 149.9333 1050 5.6202
25.7981 150.9333 1057 5.6331
25.5103 151.9333 1064 5.6376
25.3168 152.9333 1071 5.6537
25.2439 153.9333 1078 5.6608
25.2151 154.9333 1085 5.6766
25.1006 155.9333 1092 5.6852
24.8561 156.9333 1099 5.6979
24.6449 157.9333 1106 5.7047
24.6927 158.9333 1113 5.7205
24.5605 159.9333 1120 5.7308
24.4753 160.9333 1127 5.7364
24.2769 161.9333 1134 5.7492
24.1855 162.9333 1141 5.7606
24.0746 163.9333 1148 5.7705
24.0852 164.9333 1155 5.7824
23.8601 165.9333 1162 5.8000
23.9199 166.9333 1169 5.8024
23.7714 167.9333 1176 5.8088
23.5791 168.9333 1183 5.8248
23.4455 169.9333 1190 5.8391
23.3086 170.9333 1197 5.8409
23.212 171.9333 1204 5.8697
23.141 172.9333 1211 5.8701
23.1293 173.9333 1218 5.8782
23.0579 174.9333 1225 5.8770
22.9849 175.9333 1232 5.8925
22.7807 176.9333 1239 5.9106
22.7131 177.9333 1246 5.9296
22.6021 178.9333 1253 5.9321
22.5123 179.9333 1260 5.9408
22.4124 180.9333 1267 5.9475
22.3329 181.9333 1274 5.9554
22.2649 182.9333 1281 5.9708
22.1885 183.9333 1288 5.9813
22.0931 184.9333 1295 6.0005
21.9846 185.9333 1302 6.0029
21.8881 186.9333 1309 6.0112
21.774 187.9333 1316 6.0146
21.7439 188.9333 1323 6.0244
21.5759 189.9333 1330 6.0419
21.4961 190.9333 1337 6.0511
21.4368 191.9333 1344 6.0507
21.3496 192.9333 1351 6.0586
21.4152 193.9333 1358 6.0781
21.2655 194.9333 1365 6.0945
21.1975 195.9333 1372 6.0955
21.0355 196.9333 1379 6.0984
21.052 197.9333 1386 6.1121
20.9454 198.9333 1393 6.1211
20.8366 199.9333 1400 6.1329
20.6834 200.9333 1407 6.1370
20.6987 201.9333 1414 6.1536
20.5341 202.9333 1421 6.1587
20.4992 203.9333 1428 6.1699
20.4397 204.9333 1435 6.1767
20.3129 205.9333 1442 6.1777
20.3108 206.9333 1449 6.1896
20.2516 207.9333 1456 6.1985
20.1814 208.9333 1463 6.2021
20.0902 209.9333 1470 6.2021
20.0063 210.9333 1477 6.2181
20.0338 211.9333 1484 6.2263
19.9134 212.9333 1491 6.2317
19.8247 213.9333 1498 6.2520
19.7804 214.9333 1505 6.2625
19.6632 215.9333 1512 6.2625
19.6614 216.9333 1519 6.2620
19.5803 217.9333 1526 6.2757
19.601 218.9333 1533 6.2905
19.47 219.9333 1540 6.2835
19.3792 220.9333 1547 6.3060
19.3706 221.9333 1554 6.3025
19.3596 222.9333 1561 6.3028
19.2916 223.9333 1568 6.3250
19.0939 224.9333 1575 6.3325
19.0507 225.9333 1582 6.3300
18.9941 226.9333 1589 6.3360
18.9527 227.9333 1596 6.3464
18.9254 228.9333 1603 6.3518
18.881 229.9333 1610 6.3612
18.8375 230.9333 1617 6.3716
18.7364 231.9333 1624 6.3730
18.7588 232.9333 1631 6.3804
18.6872 233.9333 1638 6.3904
18.5043 234.9333 1645 6.3959
18.5269 235.9333 1652 6.4098
18.4956 236.9333 1659 6.4009
18.3694 237.9333 1666 6.4040
18.3691 238.9333 1673 6.4196
18.373 239.9333 1680 6.4273
18.2059 240.9333 1687 6.4339
18.1729 241.9333 1694 6.4358
18.1035 242.9333 1701 6.4471
18.1063 243.9333 1708 6.4549
18.0354 244.9333 1715 6.4556
18.0954 245.9333 1722 6.4605
17.9759 246.9333 1729 6.4666
17.9287 247.9333 1736 6.4679
17.8844 248.9333 1743 6.4790
17.8503 249.9333 1750 6.4797
17.833 250.9333 1757 6.4918
17.8345 251.9333 1764 6.4980
17.7298 252.9333 1771 6.5028
17.7112 253.9333 1778 6.5058
17.6014 254.9333 1785 6.5058
17.5748 255.9333 1792 6.5132
17.5118 256.9333 1799 6.5248
17.4779 257.9333 1806 6.5269
17.4426 258.9333 1813 6.5360
17.3595 259.9333 1820 6.5388
17.3976 260.9333 1827 6.5465
17.354 261.9333 1834 6.5450
17.3396 262.9333 1841 6.5533
17.3185 263.9333 1848 6.5587
17.2999 264.9333 1855 6.5605
17.1613 265.9333 1862 6.5658
17.1494 266.9333 1869 6.5647
17.1055 267.9333 1876 6.5807
17.0425 268.9333 1883 6.5801
17.0779 269.9333 1890 6.5896
17.0865 270.9333 1897 6.5849
16.9909 271.9333 1904 6.5941
16.919 272.9333 1911 6.5959
16.8923 273.9333 1918 6.6059
16.8681 274.9333 1925 6.6116
16.877 275.9333 1932 6.6085
16.7917 276.9333 1939 6.6148
16.7425 277.9333 1946 6.6182
16.7017 278.9333 1953 6.6195
16.703 279.9333 1960 6.6293
16.6371 280.9333 1967 6.6392
16.667 281.9333 1974 6.6384
16.6195 282.9333 1981 6.6404
16.5782 283.9333 1988 6.6470
16.5873 284.9333 1995 6.6382
16.4938 285.9333 2002 6.6575
16.516 286.9333 2009 6.6537
16.4256 287.9333 2016 6.6551
16.453 288.9333 2023 6.6677
16.449 289.9333 2030 6.6590
16.3485 290.9333 2037 6.6700
16.2829 291.9333 2044 6.6727
16.3498 292.9333 2051 6.6731
16.2545 293.9333 2058 6.6766
16.2536 294.9333 2065 6.6773
16.2761 295.9333 2072 6.6860
16.3016 296.9333 2079 6.6853
16.2337 297.9333 2086 6.6959
16.1811 298.9333 2093 6.6886
16.1246 299.9333 2100 6.6968
16.1425 300.9333 2107 6.6991
16.138 301.9333 2114 6.7057
16.1108 302.9333 2121 6.7032
16.0679 303.9333 2128 6.7112
16.0574 304.9333 2135 6.7171
16.067 305.9333 2142 6.7158
16.1086 306.9333 2149 6.7224
16.0117 307.9333 2156 6.7170
15.9217 308.9333 2163 6.7266
15.9784 309.9333 2170 6.7249
15.9387 310.9333 2177 6.7285
15.841 311.9333 2184 6.7353
15.8259 312.9333 2191 6.7388
15.8594 313.9333 2198 6.7376
15.8134 314.9333 2205 6.7360
15.8162 315.9333 2212 6.7460
15.8048 316.9333 2219 6.7403
15.7763 317.9333 2226 6.7476
15.716 318.9333 2233 6.7495
15.7419 319.9333 2240 6.7535
15.706 320.9333 2247 6.7569
15.6611 321.9333 2254 6.7568
15.6439 322.9333 2261 6.7577
15.6362 323.9333 2268 6.7631
15.6508 324.9333 2275 6.7637
15.6272 325.9333 2282 6.7685
15.6159 326.9333 2289 6.7728
15.5731 327.9333 2296 6.7677
15.5812 328.9333 2303 6.7738
15.5914 329.9333 2310 6.7719
15.588 330.9333 2317 6.7750
15.5489 331.9333 2324 6.7799
15.5517 332.9333 2331 6.7819
15.4816 333.9333 2338 6.7827
15.5063 334.9333 2345 6.7873
15.5117 335.9333 2352 6.7885
15.4582 336.9333 2359 6.7858
15.4405 337.9333 2366 6.7894
15.4987 338.9333 2373 6.7879
15.4376 339.9333 2380 6.7912
15.4524 340.9333 2387 6.7956
15.4557 341.9333 2394 6.7970
15.3962 342.9333 2401 6.8007
15.3762 343.9333 2408 6.7994
15.4032 344.9333 2415 6.8040
15.2926 345.9333 2422 6.8019
15.3382 346.9333 2429 6.8010
15.3554 347.9333 2436 6.8021
15.3109 348.9333 2443 6.8045
15.3026 349.9333 2450 6.8071
15.3385 350.9333 2457 6.8077
15.2695 351.9333 2464 6.8104
15.2557 352.9333 2471 6.8097
15.3136 353.9333 2478 6.8110
15.3128 354.9333 2485 6.8143
15.2277 355.9333 2492 6.8157
15.2443 356.9333 2499 6.8171
15.2306 357.9333 2506 6.8167
15.2323 358.9333 2513 6.8212
15.227 359.9333 2520 6.8229
15.1965 360.9333 2527 6.8220
15.2124 361.9333 2534 6.8256
15.2447 362.9333 2541 6.8281
15.2188 363.9333 2548 6.8242
15.2149 364.9333 2555 6.8269
15.189 365.9333 2562 6.8263
15.1822 366.9333 2569 6.8247
15.1585 367.9333 2576 6.8299
15.1288 368.9333 2583 6.8310
15.1305 369.9333 2590 6.8304
15.1751 370.9333 2597 6.8320
15.0834 371.9333 2604 6.8311
15.0894 372.9333 2611 6.8342
15.1363 373.9333 2618 6.8359
15.1114 374.9333 2625 6.8361
15.0978 375.9333 2632 6.8353
15.1166 376.9333 2639 6.8377
15.1109 377.9333 2646 6.8386
15.0895 378.9333 2653 6.8386
15.0978 379.9333 2660 6.8386
15.0366 380.9333 2667 6.8369
15.0321 381.9333 2674 6.8403
15.0732 382.9333 2681 6.8402
15.0812 383.9333 2688 6.8401
15.0625 384.9333 2695 6.8403
15.0669 385.9333 2702 6.8459
15.0066 386.9333 2709 6.8436
15.0144 387.9333 2716 6.8440
15.0914 388.9333 2723 6.8461
15.0423 389.9333 2730 6.8441
15.0375 390.9333 2737 6.8455
15.0815 391.9333 2744 6.8473
15.0449 392.9333 2751 6.8475
15.0294 393.9333 2758 6.8465
15.0335 394.9333 2765 6.8455
14.9865 395.9333 2772 6.8454
15.0043 396.9333 2779 6.8469
15.0336 397.9333 2786 6.8478
15.015 398.9333 2793 6.8482
15.0131 399.9333 2800 6.8479
15.015 400.9333 2807 6.8485
15.0506 401.9333 2814 6.8496
15.0307 402.9333 2821 6.8500
14.9824 403.9333 2828 6.8496
15.0234 404.9333 2835 6.8495
14.9565 405.9333 2842 6.8489
14.9872 406.9333 2849 6.8494
14.9977 407.9333 2856 6.8495
15.0493 408.9333 2863 6.8503
14.9773 409.9333 2870 6.8512
14.9969 410.9333 2877 6.8514
14.9251 411.9333 2884 6.8507
15.0114 412.9333 2891 6.8506
15.0524 413.9333 2898 6.8507
15.0231 414.9333 2905 6.8506
15.0035 415.9333 2912 6.8506
14.9898 416.9333 2919 6.8508
14.9573 417.9333 2926 6.8511
14.9958 418.9333 2933 6.8513
14.986 419.9333 2940 6.8513
14.987 420.9333 2947 6.8513
15.0146 421.9333 2954 6.8512
15.0282 422.9333 2961 6.8511
14.9692 423.9333 2968 6.8511
14.9311 424.9333 2975 6.8511
14.9988 425.9333 2982 6.8511
14.9601 426.9333 2989 6.8511
14.9947 427.9333 2996 6.8511
15.0288 428.5333 3000 6.8511
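
In the table above, validation loss reaches its minimum of 5.0079 at step 567 and then rises to 6.8511 by step 3000 while training loss keeps falling, a pattern that is consistent with overfitting. The sketch below shows one way to locate that minimum from the space-separated rows. Only a handful of rows are copied in for brevity, and `parse_rows` is a hypothetical helper, not part of any library.

```python
# A few rows copied verbatim from the table above.
# Column order: training loss, epoch, step, validation loss.
rows_text = """
83.7117 0.9333 7 10.2143
35.5794 79.9333 560 5.0102
35.4828 80.9333 567 5.0079
35.3006 81.9333 574 5.0116
15.0288 428.5333 3000 6.8511
"""


def parse_rows(text):
    """Hypothetical helper: turn space-separated table rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
            }
        )
    return rows


rows = parse_rows(rows_text)
best = min(rows, key=lambda r: r["val_loss"])   # row with the lowest validation loss
final = rows[-1]                                # last logged evaluation
gap = final["val_loss"] - best["val_loss"]

print(f"best step: {best['step']} (validation loss {best['val_loss']})")
print(f"final step: {final['step']} (validation loss {final['val_loss']})")
print(f"increase from best to final: {gap:.4f}")
```

Pasting the full table into `rows_text` applies the same check to every logged evaluation.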

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0

Safetensors model size: 126M params
Tensor type: F32

Collection including IParraMartin/impossible-llms-dutch-natural