impossible-llms-dutch-mirror-reversal

This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.7002
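For context, the reported evaluation loss can be converted to a perplexity via exp(loss). This is only indicative here, since training used label smoothing (0.1, see the hyperparameters below), which inflates the reported loss relative to plain cross-entropy:

```python
import math

# Convert the reported evaluation loss (mean cross-entropy in nats)
# to perplexity. With label smoothing (0.1) the reported loss is not
# a pure cross-entropy, so treat this number as indicative only.
eval_loss = 6.7002
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.1f}")  # ~ 812.6
```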

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (adamw_torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
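The derived values in the list above follow from the per-device settings. A minimal sanity check (the 0.1 warmup ratio over 3000 steps implies 300 warmup steps, which is not listed explicitly):

```python
# Sanity-check the derived hyperparameters listed above.
train_batch_size = 12           # per-device train batch size
eval_batch_size = 8             # per-device eval batch size
num_devices = 4                 # multi-GPU, 4 devices
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices
warmup_steps = int(training_steps * warmup_ratio)

print(total_train_batch_size)  # 384, as reported
print(total_eval_batch_size)   # 32, as reported
print(warmup_steps)            # 300
```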

Training results

Training Loss Epoch Step Validation Loss
30.9103 1.0 8 10.0319
28.056 2.0 16 9.2287
26.9969 3.0 24 8.8411
26.6147 4.0 32 8.6371
25.6641 5.0 40 8.4328
25.145 6.0 48 8.2174
24.3982 7.0 56 7.9810
23.6285 8.0 64 7.7355
23.0379 9.0 72 7.5022
21.9772 10.0 80 7.2703
21.2915 11.0 88 7.0383
20.7598 12.0 96 6.8205
20.001 13.0 104 6.5945
19.2441 14.0 112 6.3844
18.8406 15.0 120 6.2005
18.3274 16.0 128 6.0524
18.0617 17.0 136 5.9309
17.7874 18.0 144 5.8501
17.3244 19.0 152 5.7919
17.4684 20.0 160 5.7451
17.126 21.0 168 5.7109
17.0529 22.0 176 5.6812
16.8776 23.0 184 5.6545
16.7451 24.0 192 5.6428
16.6969 25.0 200 5.6159
16.7645 26.0 208 5.5988
16.6636 27.0 216 5.5790
16.8546 28.0 224 5.5557
16.655 29.0 232 5.5410
16.5612 30.0 240 5.5281
16.4211 31.0 248 5.5111
16.2534 32.0 256 5.4998
16.4004 33.0 264 5.4877
16.4311 34.0 272 5.4818
16.3186 35.0 280 5.4645
16.1832 36.0 288 5.4590
16.2574 37.0 296 5.4363
15.9145 38.0 304 5.4296
15.9121 39.0 312 5.4081
16.0667 40.0 320 5.3968
15.5669 41.0 328 5.3760
15.8477 42.0 336 5.3693
15.8478 43.0 344 5.3453
15.5125 44.0 352 5.3285
15.403 45.0 360 5.3063
15.4521 46.0 368 5.2922
15.312 47.0 376 5.2817
15.3821 48.0 384 5.2509
15.2664 49.0 392 5.2429
15.359 50.0 400 5.2184
15.1581 51.0 408 5.2070
15.2058 52.0 416 5.1872
15.01 53.0 424 5.1730
15.0691 54.0 432 5.1583
14.6935 55.0 440 5.1488
14.864 56.0 448 5.1295
14.5825 57.0 456 5.1234
14.5543 58.0 464 5.1105
14.613 59.0 472 5.0951
14.3289 60.0 480 5.0960
14.4674 61.0 488 5.0732
14.3531 62.0 496 5.0663
14.1359 63.0 504 5.0580
14.1938 64.0 512 5.0477
14.1091 65.0 520 5.0443
14.2239 66.0 528 5.0391
13.8412 67.0 536 5.0281
14.0587 68.0 544 5.0242
13.9094 69.0 552 5.0240
13.7833 70.0 560 5.0148
13.816 71.0 568 5.0117
13.7985 72.0 576 5.0107
13.4309 73.0 584 5.0089
13.5499 74.0 592 5.0028
13.4984 75.0 600 5.0051
13.4116 76.0 608 5.0023
13.3341 77.0 616 4.9994
13.3634 78.0 624 4.9970
13.313 79.0 632 5.0048
13.0419 80.0 640 5.0065
13.1284 81.0 648 5.0114
12.8764 82.0 656 5.0064
13.0617 83.0 664 5.0058
12.8495 84.0 672 5.0158
12.7067 85.0 680 5.0177
12.8062 86.0 688 5.0247
12.7793 87.0 696 5.0345
12.636 88.0 704 5.0247
12.4628 89.0 712 5.0373
12.6478 90.0 720 5.0371
12.5413 91.0 728 5.0627
12.6034 92.0 736 5.0514
12.4614 93.0 744 5.0552
12.3867 94.0 752 5.0707
12.2639 95.0 760 5.0782
12.2274 96.0 768 5.0841
12.2469 97.0 776 5.0801
12.1749 98.0 784 5.1052
11.8314 99.0 792 5.1057
12.0894 100.0 800 5.1163
11.8745 101.0 808 5.1183
11.8622 102.0 816 5.1226
11.8751 103.0 824 5.1347
11.5329 104.0 832 5.1491
11.6017 105.0 840 5.1575
11.531 106.0 848 5.1734
11.5597 107.0 856 5.1788
11.6417 108.0 864 5.1775
11.505 109.0 872 5.1912
11.3652 110.0 880 5.2085
11.5079 111.0 888 5.2189
11.3028 112.0 896 5.2182
11.2106 113.0 904 5.2500
11.0981 114.0 912 5.2529
11.3236 115.0 920 5.2631
11.0236 116.0 928 5.2733
11.0307 117.0 936 5.2838
10.89 118.0 944 5.3041
10.9895 119.0 952 5.3033
10.8179 120.0 960 5.3181
10.7459 121.0 968 5.3323
10.86 122.0 976 5.3344
10.757 123.0 984 5.3448
10.5552 124.0 992 5.3627
10.5588 125.0 1000 5.3767
10.4438 126.0 1008 5.3926
10.5047 127.0 1016 5.4032
10.3651 128.0 1024 5.4155
10.3235 129.0 1032 5.4213
10.3174 130.0 1040 5.4393
10.3823 131.0 1048 5.4435
10.0592 132.0 1056 5.4600
10.1487 133.0 1064 5.4781
10.0282 134.0 1072 5.4880
10.1236 135.0 1080 5.5011
9.9891 136.0 1088 5.5159
9.9827 137.0 1096 5.5236
9.8446 138.0 1104 5.5360
9.8503 139.0 1112 5.5493
9.7644 140.0 1120 5.5637
9.6672 141.0 1128 5.5805
9.6267 142.0 1136 5.5836
9.5979 143.0 1144 5.6058
9.5705 144.0 1152 5.6165
9.6073 145.0 1160 5.6226
9.4147 146.0 1168 5.6422
9.4464 147.0 1176 5.6458
9.4894 148.0 1184 5.6520
9.3131 149.0 1192 5.6717
9.2982 150.0 1200 5.6848
9.2379 151.0 1208 5.6961
9.0778 152.0 1216 5.7024
9.1257 153.0 1224 5.7192
9.0748 154.0 1232 5.7331
9.116 155.0 1240 5.7382
9.0424 156.0 1248 5.7551
8.9418 157.0 1256 5.7674
9.0228 158.0 1264 5.7747
8.984 159.0 1272 5.7845
8.8097 160.0 1280 5.8056
8.816 161.0 1288 5.8093
8.7923 162.0 1296 5.8236
8.5981 163.0 1304 5.8417
8.6469 164.0 1312 5.8444
8.6797 165.0 1320 5.8514
8.6748 166.0 1328 5.8551
8.4598 167.0 1336 5.8821
8.6173 168.0 1344 5.8946
8.4841 169.0 1352 5.8973
8.449 170.0 1360 5.9146
8.3006 171.0 1368 5.9171
8.3039 172.0 1376 5.9309
8.4201 173.0 1384 5.9301
8.252 174.0 1392 5.9593
8.1874 175.0 1400 5.9585
8.1726 176.0 1408 5.9657
8.2296 177.0 1416 5.9837
8.1589 178.0 1424 5.9840
8.0798 179.0 1432 5.9979
8.1511 180.0 1440 6.0083
8.0551 181.0 1448 6.0171
8.0453 182.0 1456 6.0317
7.9697 183.0 1464 6.0370
7.8853 184.0 1472 6.0483
7.8942 185.0 1480 6.0599
7.8221 186.0 1488 6.0698
7.8863 187.0 1496 6.0805
7.8503 188.0 1504 6.0890
7.8323 189.0 1512 6.0919
7.6653 190.0 1520 6.1099
7.7381 191.0 1528 6.1087
7.7704 192.0 1536 6.1238
7.6908 193.0 1544 6.1394
7.6836 194.0 1552 6.1372
7.6312 195.0 1560 6.1477
7.6465 196.0 1568 6.1507
7.4888 197.0 1576 6.1590
7.5677 198.0 1584 6.1682
7.4875 199.0 1592 6.1808
7.4614 200.0 1600 6.1874
7.5085 201.0 1608 6.1974
7.4567 202.0 1616 6.2028
7.3157 203.0 1624 6.2131
7.3828 204.0 1632 6.2247
7.3833 205.0 1640 6.2323
7.2833 206.0 1648 6.2361
7.2995 207.0 1656 6.2439
7.2849 208.0 1664 6.2509
7.1774 209.0 1672 6.2628
7.1998 210.0 1680 6.2635
7.1817 211.0 1688 6.2649
7.124 212.0 1696 6.2755
7.1689 213.0 1704 6.2743
7.0623 214.0 1712 6.2975
7.091 215.0 1720 6.2979
7.0882 216.0 1728 6.3007
7.1246 217.0 1736 6.3202
7.1167 218.0 1744 6.3108
7.0163 219.0 1752 6.3240
6.9701 220.0 1760 6.3281
6.9373 221.0 1768 6.3400
6.9172 222.0 1776 6.3392
6.9556 223.0 1784 6.3456
6.963 224.0 1792 6.3445
6.8963 225.0 1800 6.3615
6.8763 226.0 1808 6.3663
6.85 227.0 1816 6.3830
6.8177 228.0 1824 6.3823
6.7342 229.0 1832 6.3883
6.8439 230.0 1840 6.3956
6.8067 231.0 1848 6.3969
6.7436 232.0 1856 6.3948
6.7367 233.0 1864 6.4120
6.683 234.0 1872 6.4211
6.6836 235.0 1880 6.4276
6.7275 236.0 1888 6.4258
6.7124 237.0 1896 6.4273
6.5903 238.0 1904 6.4424
6.6895 239.0 1912 6.4392
6.6896 240.0 1920 6.4418
6.6602 241.0 1928 6.4449
6.6101 242.0 1936 6.4609
6.5861 243.0 1944 6.4578
6.5754 244.0 1952 6.4664
6.6159 245.0 1960 6.4716
6.5653 246.0 1968 6.4679
6.5839 247.0 1976 6.4820
6.5405 248.0 1984 6.4882
6.5113 249.0 1992 6.4818
6.4927 250.0 2000 6.4876
6.4555 251.0 2008 6.4984
6.4701 252.0 2016 6.4968
6.4466 253.0 2024 6.5020
6.4337 254.0 2032 6.5127
6.3718 255.0 2040 6.5140
6.4255 256.0 2048 6.5181
6.4007 257.0 2056 6.5198
6.4029 258.0 2064 6.5252
6.3279 259.0 2072 6.5341
6.277 260.0 2080 6.5370
6.3267 261.0 2088 6.5411
6.2975 262.0 2096 6.5414
6.3183 263.0 2104 6.5425
6.3305 264.0 2112 6.5555
6.3427 265.0 2120 6.5499
6.2753 266.0 2128 6.5595
6.2535 267.0 2136 6.5640
6.2217 268.0 2144 6.5603
6.2643 269.0 2152 6.5651
6.1871 270.0 2160 6.5747
6.2287 271.0 2168 6.5717
6.2535 272.0 2176 6.5824
6.1869 273.0 2184 6.5760
6.1555 274.0 2192 6.5879
6.2114 275.0 2200 6.5850
6.1309 276.0 2208 6.5871
6.1916 277.0 2216 6.5887
6.1759 278.0 2224 6.5979
6.1679 279.0 2232 6.5977
6.1277 280.0 2240 6.6050
6.1643 281.0 2248 6.6039
6.1299 282.0 2256 6.6047
6.1239 283.0 2264 6.6047
6.0947 284.0 2272 6.6153
6.1301 285.0 2280 6.6095
6.0821 286.0 2288 6.6121
6.0621 287.0 2296 6.6189
6.0642 288.0 2304 6.6187
6.1039 289.0 2312 6.6260
6.0503 290.0 2320 6.6257
6.0643 291.0 2328 6.6293
6.0313 292.0 2336 6.6353
6.072 293.0 2344 6.6297
5.9886 294.0 2352 6.6313
6.0222 295.0 2360 6.6336
6.0321 296.0 2368 6.6374
5.9807 297.0 2376 6.6401
5.9905 298.0 2384 6.6393
5.9883 299.0 2392 6.6431
5.984 300.0 2400 6.6501
6.0008 301.0 2408 6.6480
5.9877 302.0 2416 6.6549
5.9699 303.0 2424 6.6554
5.9096 304.0 2432 6.6557
5.9477 305.0 2440 6.6564
5.9377 306.0 2448 6.6552
5.9361 307.0 2456 6.6579
5.9173 308.0 2464 6.6622
5.978 309.0 2472 6.6641
5.9166 310.0 2480 6.6682
5.9207 311.0 2488 6.6652
5.88 312.0 2496 6.6615
5.9201 313.0 2504 6.6670
5.9591 314.0 2512 6.6694
5.8915 315.0 2520 6.6695
5.9303 316.0 2528 6.6693
5.9002 317.0 2536 6.6689
5.9243 318.0 2544 6.6709
5.869 319.0 2552 6.6721
5.9159 320.0 2560 6.6729
5.9035 321.0 2568 6.6754
5.854 322.0 2576 6.6778
5.8641 323.0 2584 6.6788
5.9276 324.0 2592 6.6810
5.8943 325.0 2600 6.6838
5.8702 326.0 2608 6.6825
5.893 327.0 2616 6.6858
5.8256 328.0 2624 6.6843
5.8672 329.0 2632 6.6845
5.8599 330.0 2640 6.6886
5.8604 331.0 2648 6.6858
5.8374 332.0 2656 6.6863
5.8727 333.0 2664 6.6893
5.8593 334.0 2672 6.6904
5.8642 335.0 2680 6.6892
5.8481 336.0 2688 6.6908
5.8547 337.0 2696 6.6921
5.8378 338.0 2704 6.6913
5.8801 339.0 2712 6.6932
5.838 340.0 2720 6.6953
5.8315 341.0 2728 6.6937
5.8522 342.0 2736 6.6962
5.8186 343.0 2744 6.6935
5.8133 344.0 2752 6.6963
5.7997 345.0 2760 6.6947
5.8253 346.0 2768 6.6967
5.8479 347.0 2776 6.6955
5.8557 348.0 2784 6.6981
5.7852 349.0 2792 6.6962
5.8134 350.0 2800 6.6983
5.7735 351.0 2808 6.6959
5.8391 352.0 2816 6.6985
5.8045 353.0 2824 6.6974
5.7944 354.0 2832 6.6977
5.8557 355.0 2840 6.6984
5.8033 356.0 2848 6.6985
5.8719 357.0 2856 6.6990
5.8378 358.0 2864 6.6997
5.8158 359.0 2872 6.6999
5.8151 360.0 2880 6.6994
5.8459 361.0 2888 6.6994
5.8233 362.0 2896 6.7002
5.831 363.0 2904 6.7005
5.84 364.0 2912 6.7007
5.8221 365.0 2920 6.7006
5.8135 366.0 2928 6.7005
5.8016 367.0 2936 6.7004
5.8481 368.0 2944 6.7003
5.7974 369.0 2952 6.7003
5.8429 370.0 2960 6.7002
5.8386 371.0 2968 6.7002
5.8196 372.0 2976 6.7002
5.8117 373.0 2984 6.7002
5.852 374.0 2992 6.7002
5.7905 375.0 3000 6.7002
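Note that the validation loss bottoms out around epoch 78 (4.9970) and then rises steadily while the training loss keeps falling, so the final checkpoint is well past the best one. A small sketch of locating the best epoch, using an excerpt of the rows above:

```python
# Find the epoch with the lowest validation loss.
# Excerpt of (epoch, validation_loss) rows from the table above.
rows = [
    (76.0, 5.0023),
    (77.0, 4.9994),
    (78.0, 4.9970),
    (79.0, 5.0048),
    (375.0, 6.7002),
]
best_epoch, best_loss = min(rows, key=lambda r: r[1])
print(best_epoch, best_loss)  # 78.0 4.997
```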

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0
Model details

  • Model size: 126M parameters
  • Tensor type: F32 (Safetensors)
