impossible-llms-dutch-fronting-n

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 7.2382
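
For readers who prefer perplexity, the reported loss can be converted with a one-line calculation. This is only a rough sketch: it assumes the loss is a mean per-token cross-entropy in nats, which the card does not state, and the label smoothing listed under the training hyperparameters may inflate the value relative to a plain cross-entropy.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 7.2382

# Assumption: the loss is a mean per-token cross-entropy measured in nats,
# so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"approximate perplexity: {perplexity:.1f}")
```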

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9,0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
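
The derived values in the list above can be checked with a little arithmetic. The sketch below recomputes the total batch sizes and the warmup length, and traces an assumed learning-rate curve (linear warmup, then a single half-cosine decay to zero). The schedule shape is an assumption based on the `cosine` scheduler type and warmup ratio; `lr_at` is a hypothetical helper, not a function from any library.

```python
import math

# Values copied from the hyperparameter list above.
learning_rate = 1e-4
train_batch_size = 12          # per device
eval_batch_size = 8            # per device
num_devices = 4
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Number of optimizer steps spent in warmup.
warmup_steps = math.ceil(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print(total_train_batch_size, total_eval_batch_size, warmup_steps)
for step in (0, 150, 300, 1650, 3000):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at the end of warmup and decays to zero by the final step, which would be consistent with the validation loss barely changing over the last few hundred steps of the results table.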

Training results

Training Loss Epoch Step Validation Loss
83.6701 0.9180 7 10.1956
75.7086 1.9180 14 9.3449
72.2956 2.9180 21 8.9429
70.2558 3.9180 28 8.7347
68.9734 4.9180 35 8.5521
68.2176 5.9180 42 8.3859
66.0063 6.9180 49 8.2031
64.8254 7.9180 56 8.0173
63.0804 8.9180 63 7.8319
60.8456 9.9180 70 7.6351
59.923 10.9180 77 7.4411
58.8844 11.9180 84 7.2334
56.4646 12.9180 91 7.0356
55.2957 13.9180 98 6.8305
53.6488 14.9180 105 6.6407
52.1409 15.9180 112 6.4615
50.6388 16.9180 119 6.3001
49.3661 17.9180 126 6.1815
48.7792 18.9180 133 6.0784
47.8148 19.9180 140 5.9857
47.9591 20.9180 147 5.9544
47.3827 21.9180 154 5.8818
46.8284 22.9180 161 5.8467
46.4072 23.9180 168 5.8142
45.984 24.9180 175 5.7750
45.874 25.9180 182 5.7514
45.8507 26.9180 189 5.7228
45.7081 27.9180 196 5.7023
45.3109 28.9180 203 5.6814
45.2991 29.9180 210 5.6638
45.3284 30.9180 217 5.6439
44.874 31.9180 224 5.6230
44.245 32.9180 231 5.6115
44.459 33.9180 238 5.5926
44.1098 34.9180 245 5.5806
44.1613 35.9180 252 5.5706
44.0722 36.9180 259 5.5529
43.4508 37.9180 266 5.5428
43.882 38.9180 273 5.5250
43.4194 39.9180 280 5.5148
43.6083 40.9180 287 5.5060
43.3132 41.9180 294 5.4921
43.1959 42.9180 301 5.4813
43.1946 43.9180 308 5.4666
43.1663 44.9180 315 5.4574
42.9024 45.9180 322 5.4476
42.4793 46.9180 329 5.4333
42.3949 47.9180 336 5.4227
42.4094 48.9180 343 5.4119
42.3865 49.9180 350 5.3982
42.4174 50.9180 357 5.3901
41.8991 51.9180 364 5.3743
41.391 52.9180 371 5.3608
41.3463 53.9180 378 5.3533
40.8934 54.9180 385 5.3442
41.2137 55.9180 392 5.3380
40.8002 56.9180 399 5.3214
40.526 57.9180 406 5.3211
40.8136 58.9180 413 5.3158
40.4688 59.9180 420 5.3061
40.4979 60.9180 427 5.3024
39.9642 61.9180 434 5.2950
39.9373 62.9180 441 5.2918
39.5555 63.9180 448 5.2838
39.7243 64.9180 455 5.2777
39.693 65.9180 462 5.2666
39.475 66.9180 469 5.2668
39.2162 67.9180 476 5.2650
38.7789 68.9180 483 5.2657
38.9602 69.9180 490 5.2537
38.763 70.9180 497 5.2586
38.7509 71.9180 504 5.2554
38.5699 72.9180 511 5.2596
38.3162 73.9180 518 5.2482
38.0983 74.9180 525 5.2565
37.5655 75.9180 532 5.2545
38.3244 76.9180 539 5.2597
38.061 77.9180 546 5.2609
37.1937 78.9180 553 5.2649
37.4442 79.9180 560 5.2626
37.0761 80.9180 567 5.2609
37.0859 81.9180 574 5.2618
37.0919 82.9180 581 5.2651
36.7448 83.9180 588 5.2737
37.1852 84.9180 595 5.2719
36.3939 85.9180 602 5.2788
36.2088 86.9180 609 5.2963
35.7103 87.9180 616 5.2891
35.5618 88.9180 623 5.2932
36.0717 89.9180 630 5.3039
35.9256 90.9180 637 5.3101
35.4183 91.9180 644 5.3116
35.5809 92.9180 651 5.3253
35.4696 93.9180 658 5.3302
35.053 94.9180 665 5.3338
35.0064 95.9180 672 5.3376
35.079 96.9180 679 5.3470
34.7047 97.9180 686 5.3546
34.3325 98.9180 693 5.3750
34.0402 99.9180 700 5.3726
34.1036 100.9180 707 5.3827
34.1095 101.9180 714 5.4029
34.1152 102.9180 721 5.4032
33.7175 103.9180 728 5.4104
34.0196 104.9180 735 5.4260
33.6881 105.9180 742 5.4331
33.5922 106.9180 749 5.4362
33.3315 107.9180 756 5.4593
33.2476 108.9180 763 5.4699
32.8163 109.9180 770 5.4719
32.6824 110.9180 777 5.4882
32.5827 111.9180 784 5.4952
32.6458 112.9180 791 5.5115
32.2005 113.9180 798 5.5122
32.2591 114.9180 805 5.5220
31.9897 115.9180 812 5.5387
31.8702 116.9180 819 5.5453
31.6531 117.9180 826 5.5683
31.3954 118.9180 833 5.5624
31.7054 119.9180 840 5.5764
31.4914 120.9180 847 5.5936
31.3499 121.9180 854 5.6102
31.195 122.9180 861 5.6120
30.9096 123.9180 868 5.6217
30.7157 124.9180 875 5.6366
30.5331 125.9180 882 5.6528
30.5613 126.9180 889 5.6834
30.2145 127.9180 896 5.6852
30.113 128.9180 903 5.6898
30.1151 129.9180 910 5.7056
29.9459 130.9180 917 5.7215
29.5514 131.9180 924 5.7287
29.8315 132.9180 931 5.7279
29.5987 133.9180 938 5.7456
29.3225 134.9180 945 5.7527
29.2781 135.9180 952 5.7717
29.0682 136.9180 959 5.7949
29.0572 137.9180 966 5.8020
28.9202 138.9180 973 5.8055
28.6232 139.9180 980 5.8158
28.6747 140.9180 987 5.8414
28.3709 141.9180 994 5.8472
28.2324 142.9180 1001 5.8516
28.0045 143.9180 1008 5.8674
27.9814 144.9180 1015 5.8802
27.9927 145.9180 1022 5.8953
27.9142 146.9180 1029 5.9083
27.9523 147.9180 1036 5.9147
27.7392 148.9180 1043 5.9287
27.4822 149.9180 1050 5.9309
27.2515 150.9180 1057 5.9551
27.2118 151.9180 1064 5.9693
26.8956 152.9180 1071 5.9823
27.1035 153.9180 1078 5.9958
26.6303 154.9180 1085 5.9958
26.7365 155.9180 1092 6.0043
26.4714 156.9180 1099 6.0132
26.4539 157.9180 1106 6.0331
26.3736 158.9180 1113 6.0507
26.1776 159.9180 1120 6.0525
25.9563 160.9180 1127 6.0720
25.9868 161.9180 1134 6.0791
25.7172 162.9180 1141 6.0920
25.7482 163.9180 1148 6.1069
25.7614 164.9180 1155 6.1168
25.4299 165.9180 1162 6.1296
25.4207 166.9180 1169 6.1330
25.3818 167.9180 1176 6.1523
25.1478 168.9180 1183 6.1641
24.9907 169.9180 1190 6.1770
24.9778 170.9180 1197 6.1889
24.8723 171.9180 1204 6.2013
24.647 172.9180 1211 6.2126
24.622 173.9180 1218 6.2132
24.4667 174.9180 1225 6.2380
24.2671 175.9180 1232 6.2411
24.3566 176.9180 1239 6.2497
24.0931 177.9180 1246 6.2572
24.0716 178.9180 1253 6.2773
23.8071 179.9180 1260 6.2888
23.8775 180.9180 1267 6.2936
23.7108 181.9180 1274 6.3013
23.5933 182.9180 1281 6.3154
23.4123 183.9180 1288 6.3281
23.5153 184.9180 1295 6.3312
23.2879 185.9180 1302 6.3452
23.3347 186.9180 1309 6.3558
23.1568 187.9180 1316 6.3614
22.9392 188.9180 1323 6.3771
23.0089 189.9180 1330 6.3784
22.8168 190.9180 1337 6.3908
22.8451 191.9180 1344 6.3943
22.5212 192.9180 1351 6.4143
22.564 193.9180 1358 6.4124
22.5765 194.9180 1365 6.4407
22.4188 195.9180 1372 6.4379
22.332 196.9180 1379 6.4485
22.1724 197.9180 1386 6.4655
22.1624 198.9180 1393 6.4672
22.1432 199.9180 1400 6.4842
21.9701 200.9180 1407 6.4928
21.9176 201.9180 1414 6.4992
21.9843 202.9180 1421 6.4977
21.7467 203.9180 1428 6.5226
21.6995 204.9180 1435 6.5158
21.5824 205.9180 1442 6.5391
21.4719 206.9180 1449 6.5345
21.513 207.9180 1456 6.5496
21.2908 208.9180 1463 6.5533
21.371 209.9180 1470 6.5709
21.3238 210.9180 1477 6.5808
21.0399 211.9180 1484 6.5865
21.0396 212.9180 1491 6.5821
21.0588 213.9180 1498 6.5931
20.8786 214.9180 1505 6.6018
20.8811 215.9180 1512 6.6110
20.7742 216.9180 1519 6.6254
20.6929 217.9180 1526 6.6402
20.5138 218.9180 1533 6.6344
20.5602 219.9180 1540 6.6518
20.5375 220.9180 1547 6.6601
20.2728 221.9180 1554 6.6587
20.2767 222.9180 1561 6.6601
20.3602 223.9180 1568 6.6751
20.0776 224.9180 1575 6.6863
20.118 225.9180 1582 6.6826
20.064 226.9180 1589 6.6926
19.9295 227.9180 1596 6.7076
20.0506 228.9180 1603 6.7062
19.8346 229.9180 1610 6.7225
19.9416 230.9180 1617 6.7310
19.7974 231.9180 1624 6.7296
19.6642 232.9180 1631 6.7407
19.6463 233.9180 1638 6.7568
19.6482 234.9180 1645 6.7512
19.5956 235.9180 1652 6.7669
19.4431 236.9180 1659 6.7735
19.3011 237.9180 1666 6.7646
19.3428 238.9180 1673 6.7841
19.2815 239.9180 1680 6.7967
19.2098 240.9180 1687 6.7865
19.1405 241.9180 1694 6.8024
19.1654 242.9180 1701 6.8057
19.0602 243.9180 1708 6.8148
18.9197 244.9180 1715 6.8276
18.9563 245.9180 1722 6.8290
18.8658 246.9180 1729 6.8274
18.7788 247.9180 1736 6.8348
18.7779 248.9180 1743 6.8503
18.7738 249.9180 1750 6.8421
18.7014 250.9180 1757 6.8532
18.7032 251.9180 1764 6.8603
18.6893 252.9180 1771 6.8675
18.6286 253.9180 1778 6.8717
18.4156 254.9180 1785 6.8790
18.5478 255.9180 1792 6.8812
18.4077 256.9180 1799 6.8913
18.317 257.9180 1806 6.8902
18.3641 258.9180 1813 6.8969
18.255 259.9180 1820 6.9077
18.2521 260.9180 1827 6.9164
18.1424 261.9180 1834 6.9142
18.1594 262.9180 1841 6.9232
18.0446 263.9180 1848 6.9272
17.9396 264.9180 1855 6.9296
17.8788 265.9180 1862 6.9328
18.0324 266.9180 1869 6.9364
17.9351 267.9180 1876 6.9460
17.9139 268.9180 1883 6.9463
17.8015 269.9180 1890 6.9509
17.8447 270.9180 1897 6.9670
17.8626 271.9180 1904 6.9519
17.7717 272.9180 1911 6.9759
17.5444 273.9180 1918 6.9762
17.6437 274.9180 1925 6.9767
17.5875 275.9180 1932 6.9895
17.5209 276.9180 1939 6.9913
17.5435 277.9180 1946 6.9944
17.4605 278.9180 1953 6.9932
17.5248 279.9180 1960 7.0018
17.4359 280.9180 1967 7.0021
17.3026 281.9180 1974 7.0135
17.3997 282.9180 1981 7.0143
17.3819 283.9180 1988 7.0109
17.2925 284.9180 1995 7.0209
17.3533 285.9180 2002 7.0310
17.2202 286.9180 2009 7.0301
17.1709 287.9180 2016 7.0267
17.2149 288.9180 2023 7.0373
17.2066 289.9180 2030 7.0373
17.1161 290.9180 2037 7.0458
17.1385 291.9180 2044 7.0494
17.1014 292.9180 2051 7.0507
17.0271 293.9180 2058 7.0579
17.0832 294.9180 2065 7.0550
16.9456 295.9180 2072 7.0605
16.9715 296.9180 2079 7.0636
16.8997 297.9180 2086 7.0760
16.8188 298.9180 2093 7.0707
16.8373 299.9180 2100 7.0735
16.8626 300.9180 2107 7.0730
16.7618 301.9180 2114 7.0811
16.8143 302.9180 2121 7.0819
16.7815 303.9180 2128 7.0868
16.6802 304.9180 2135 7.0916
16.7722 305.9180 2142 7.0921
16.7145 306.9180 2149 7.1001
16.5915 307.9180 2156 7.0993
16.5851 308.9180 2163 7.0998
16.6385 309.9180 2170 7.1097
16.529 310.9180 2177 7.1086
16.5651 311.9180 2184 7.1133
16.5234 312.9180 2191 7.1117
16.5097 313.9180 2198 7.1135
16.5259 314.9180 2205 7.1232
16.4198 315.9180 2212 7.1231
16.4323 316.9180 2219 7.1330
16.377 317.9180 2226 7.1289
16.492 318.9180 2233 7.1288
16.3722 319.9180 2240 7.1322
16.2743 320.9180 2247 7.1321
16.4259 321.9180 2254 7.1439
16.4179 322.9180 2261 7.1404
16.3535 323.9180 2268 7.1443
16.343 324.9180 2275 7.1461
16.3196 325.9180 2282 7.1412
16.2085 326.9180 2289 7.1510
16.2591 327.9180 2296 7.1544
16.2155 328.9180 2303 7.1535
16.1526 329.9180 2310 7.1581
16.2035 330.9180 2317 7.1620
16.165 331.9180 2324 7.1615
16.191 332.9180 2331 7.1611
16.1696 333.9180 2338 7.1646
16.1484 334.9180 2345 7.1680
16.1228 335.9180 2352 7.1665
16.0788 336.9180 2359 7.1705
16.0677 337.9180 2366 7.1766
16.0622 338.9180 2373 7.1777
16.033 339.9180 2380 7.1717
16.0733 340.9180 2387 7.1757
16.0157 341.9180 2394 7.1818
16.0095 342.9180 2401 7.1789
15.94 343.9180 2408 7.1859
15.9847 344.9180 2415 7.1874
15.8697 345.9180 2422 7.1878
15.9282 346.9180 2429 7.1925
16.0203 347.9180 2436 7.1898
15.9902 348.9180 2443 7.1898
15.8958 349.9180 2450 7.1949
15.9287 350.9180 2457 7.1970
15.9108 351.9180 2464 7.1993
15.8696 352.9180 2471 7.1988
15.8532 353.9180 2478 7.1968
15.8451 354.9180 2485 7.2027
15.8597 355.9180 2492 7.2006
15.8685 356.9180 2499 7.2004
15.85 357.9180 2506 7.2044
15.8046 358.9180 2513 7.2043
15.8863 359.9180 2520 7.2060
15.8021 360.9180 2527 7.2076
15.7762 361.9180 2534 7.2055
15.7365 362.9180 2541 7.2099
15.7285 363.9180 2548 7.2120
15.6606 364.9180 2555 7.2130
15.7807 365.9180 2562 7.2140
15.7585 366.9180 2569 7.2162
15.6561 367.9180 2576 7.2127
15.792 368.9180 2583 7.2180
15.7455 369.9180 2590 7.2173
15.6825 370.9180 2597 7.2176
15.7242 371.9180 2604 7.2231
15.7076 372.9180 2611 7.2235
15.6556 373.9180 2618 7.2210
15.6672 374.9180 2625 7.2204
15.7014 375.9180 2632 7.2212
15.6099 376.9180 2639 7.2224
15.6903 377.9180 2646 7.2277
15.6465 378.9180 2653 7.2243
15.6777 379.9180 2660 7.2227
15.6024 380.9180 2667 7.2267
15.6441 381.9180 2674 7.2263
15.6612 382.9180 2681 7.2293
15.5736 383.9180 2688 7.2294
15.6272 384.9180 2695 7.2294
15.638 385.9180 2702 7.2301
15.6038 386.9180 2709 7.2301
15.599 387.9180 2716 7.2307
15.6235 388.9180 2723 7.2289
15.5048 389.9180 2730 7.2326
15.6601 390.9180 2737 7.2320
15.6125 391.9180 2744 7.2335
15.59 392.9180 2751 7.2305
15.6286 393.9180 2758 7.2328
15.5775 394.9180 2765 7.2340
15.6249 395.9180 2772 7.2338
15.6088 396.9180 2779 7.2346
15.5337 397.9180 2786 7.2347
15.6563 398.9180 2793 7.2332
15.5707 399.9180 2800 7.2343
15.5497 400.9180 2807 7.2363
15.5564 401.9180 2814 7.2349
15.5751 402.9180 2821 7.2358
15.5842 403.9180 2828 7.2362
15.5886 404.9180 2835 7.2366
15.5082 405.9180 2842 7.2367
15.5615 406.9180 2849 7.2362
15.5661 407.9180 2856 7.2367
15.5817 408.9180 2863 7.2378
15.5737 409.9180 2870 7.2378
15.5527 410.9180 2877 7.2374
15.5545 411.9180 2884 7.2375
15.6062 412.9180 2891 7.2375
15.6178 413.9180 2898 7.2378
15.5732 414.9180 2905 7.2382
15.5323 415.9180 2912 7.2383
15.5337 416.9180 2919 7.2383
15.5829 417.9180 2926 7.2382
15.6258 418.9180 2933 7.2381
15.546 419.9180 2940 7.2382
15.5106 420.9180 2947 7.2382
15.5473 421.9180 2954 7.2382
15.5618 422.9180 2961 7.2382
15.5927 423.9180 2968 7.2382
15.5168 424.9180 2975 7.2382
15.5718 425.9180 2982 7.2382
15.5793 426.9180 2989 7.2382
15.5687 427.9180 2996 7.2382
15.5714 428.5246 3000 7.2382
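
In the table above, validation loss reaches its minimum (5.2482) at step 518 and then climbs steadily while training loss keeps falling, a pattern usually read as overfitting. The sketch below shows one way to locate the best checkpoint from such a log. It uses only a hand-picked sample of rows copied from the table, and the variable names are illustrative.

```python
# (step, training_loss, validation_loss) rows copied from the table above.
# Only a sample of rows is included to keep the sketch short.
rows = [
    (7, 83.6701, 10.1956),
    (280, 43.4194, 5.5148),
    (490, 38.9602, 5.2537),
    (518, 38.3162, 5.2482),
    (532, 37.5655, 5.2545),
    (700, 34.0402, 5.3726),
    (1400, 22.1432, 6.4842),
    (2100, 16.8373, 7.0735),
    (3000, 15.5714, 7.2382),
]

# The row with the lowest validation loss marks the best checkpoint in this sample.
best_step, _, best_val = min(rows, key=lambda r: r[2])

# Compare against the final row, which is the loss reported at the top of the card.
final_step, _, final_val = rows[-1]
gap = final_val - best_val

print(f"best validation loss {best_val} at step {best_step}")
print(f"final validation loss {final_val} at step {final_step} (gap {gap:.2f})")
```

If intermediate checkpoints were saved, the one nearest step 518 would likely generalize better than the final weights; the card does not say whether such checkpoints exist.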

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0

Model size: 126M params (Safetensors, tensor type F32)

Collection including IParraMartin/impossible-llms-dutch-fronting-n