Part of the Natural Order LMs collection (35 items): all the models trained in the paper 'Natural Order: Cross-lingual Limits of Transformer Language Acquisition'.
This model is a fine-tuned version of a base model that is not named here, trained on an unknown dataset. The last validation loss logged on the evaluation set is 5.7709 (step 3000).
Model description, intended uses and limitations, and training and evaluation data: more information needed.
The training hyperparameters are not listed. Training results, logged every 14 steps (one epoch):
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
19.3413 | 1.0 | 14 | 9.5560 |
17.8282 | 2.0 | 28 | 8.8858 |
16.7865 | 3.0 | 42 | 8.3290 |
16.2985 | 4.0 | 56 | 7.9541 |
15.2854 | 5.0 | 70 | 7.5663 |
14.1165 | 6.0 | 84 | 7.1540 |
13.6 | 7.0 | 98 | 6.7456 |
12.6682 | 8.0 | 112 | 6.3757 |
12.1537 | 9.0 | 126 | 6.0589 |
11.8659 | 10.0 | 140 | 5.8500 |
11.4423 | 11.0 | 154 | 5.7267 |
11.2484 | 12.0 | 168 | 5.6468 |
11.2254 | 13.0 | 182 | 5.5859 |
11.0824 | 14.0 | 196 | 5.5306 |
10.9985 | 15.0 | 210 | 5.4810 |
10.7426 | 16.0 | 224 | 5.4386 |
10.7897 | 17.0 | 238 | 5.4003 |
10.6788 | 18.0 | 252 | 5.3649 |
10.6826 | 19.0 | 266 | 5.3284 |
10.5997 | 20.0 | 280 | 5.2927 |
10.6804 | 21.0 | 294 | 5.2608 |
10.4569 | 22.0 | 308 | 5.2348 |
10.3926 | 23.0 | 322 | 5.1974 |
10.4329 | 24.0 | 336 | 5.1658 |
10.2911 | 25.0 | 350 | 5.1354 |
10.137 | 26.0 | 364 | 5.1093 |
9.9448 | 27.0 | 378 | 5.0777 |
10.1379 | 28.0 | 392 | 5.0526 |
9.9001 | 29.0 | 406 | 5.0275 |
9.9793 | 30.0 | 420 | 5.0011 |
9.877 | 31.0 | 434 | 4.9731 |
9.7064 | 32.0 | 448 | 4.9436 |
9.7728 | 33.0 | 462 | 4.9232 |
9.7954 | 34.0 | 476 | 4.9039 |
9.7143 | 35.0 | 490 | 4.8785 |
9.5204 | 36.0 | 504 | 4.8604 |
9.5834 | 37.0 | 518 | 4.8411 |
9.5114 | 38.0 | 532 | 4.8253 |
9.4687 | 39.0 | 546 | 4.8085 |
9.5096 | 40.0 | 560 | 4.7967 |
9.3579 | 41.0 | 574 | 4.7805 |
9.3129 | 42.0 | 588 | 4.7687 |
9.2536 | 43.0 | 602 | 4.7568 |
9.2814 | 44.0 | 616 | 4.7451 |
9.0799 | 45.0 | 630 | 4.7389 |
9.117 | 46.0 | 644 | 4.7261 |
9.1623 | 47.0 | 658 | 4.7200 |
9.113 | 48.0 | 672 | 4.7107 |
8.8764 | 49.0 | 686 | 4.7052 |
8.9128 | 50.0 | 700 | 4.7027 |
8.9086 | 51.0 | 714 | 4.6936 |
8.9187 | 52.0 | 728 | 4.6915 |
8.7324 | 53.0 | 742 | 4.6900 |
8.7402 | 54.0 | 756 | 4.6837 |
8.7481 | 55.0 | 770 | 4.6854 |
8.7484 | 56.0 | 784 | 4.6828 |
8.7518 | 57.0 | 798 | 4.6822 |
8.5725 | 58.0 | 812 | 4.6833 |
8.4755 | 59.0 | 826 | 4.6840 |
8.4294 | 60.0 | 840 | 4.6829 |
8.4633 | 61.0 | 854 | 4.6888 |
8.4685 | 62.0 | 868 | 4.6929 |
8.3212 | 63.0 | 882 | 4.6926 |
8.387 | 64.0 | 896 | 4.6999 |
8.3862 | 65.0 | 910 | 4.7042 |
8.2601 | 66.0 | 924 | 4.7073 |
8.2653 | 67.0 | 938 | 4.7148 |
8.2147 | 68.0 | 952 | 4.7240 |
8.1959 | 69.0 | 966 | 4.7297 |
8.1291 | 70.0 | 980 | 4.7366 |
8.0387 | 71.0 | 994 | 4.7429 |
8.1311 | 72.0 | 1008 | 4.7492 |
7.9853 | 73.0 | 1022 | 4.7598 |
7.9861 | 74.0 | 1036 | 4.7649 |
7.917 | 75.0 | 1050 | 4.7758 |
7.8244 | 76.0 | 1064 | 4.7903 |
7.818 | 77.0 | 1078 | 4.7944 |
7.7401 | 78.0 | 1092 | 4.8060 |
7.8455 | 79.0 | 1106 | 4.8160 |
7.8481 | 80.0 | 1120 | 4.8232 |
7.755 | 81.0 | 1134 | 4.8323 |
7.7421 | 82.0 | 1148 | 4.8490 |
7.605 | 83.0 | 1162 | 4.8602 |
7.6062 | 84.0 | 1176 | 4.8701 |
7.5694 | 85.0 | 1190 | 4.8825 |
7.5111 | 86.0 | 1204 | 4.8959 |
7.5136 | 87.0 | 1218 | 4.9056 |
7.3802 | 88.0 | 1232 | 4.9169 |
7.4024 | 89.0 | 1246 | 4.9302 |
7.4278 | 90.0 | 1260 | 4.9438 |
7.2473 | 91.0 | 1274 | 4.9553 |
7.1056 | 92.0 | 1288 | 4.9698 |
7.3367 | 93.0 | 1302 | 4.9804 |
7.2153 | 94.0 | 1316 | 4.9915 |
7.1236 | 95.0 | 1330 | 5.0020 |
7.238 | 96.0 | 1344 | 5.0151 |
7.1268 | 97.0 | 1358 | 5.0281 |
7.0722 | 98.0 | 1372 | 5.0430 |
6.9907 | 99.0 | 1386 | 5.0533 |
7.0275 | 100.0 | 1400 | 5.0652 |
7.1082 | 101.0 | 1414 | 5.0770 |
7.0626 | 102.0 | 1428 | 5.0962 |
6.9378 | 103.0 | 1442 | 5.1016 |
6.896 | 104.0 | 1456 | 5.1189 |
6.864 | 105.0 | 1470 | 5.1320 |
6.8943 | 106.0 | 1484 | 5.1349 |
6.8454 | 107.0 | 1498 | 5.1519 |
6.7281 | 108.0 | 1512 | 5.1649 |
6.7745 | 109.0 | 1526 | 5.1831 |
6.5361 | 110.0 | 1540 | 5.1951 |
6.6865 | 111.0 | 1554 | 5.1988 |
6.6242 | 112.0 | 1568 | 5.2155 |
6.6225 | 113.0 | 1582 | 5.2278 |
6.5798 | 114.0 | 1596 | 5.2335 |
6.556 | 115.0 | 1610 | 5.2532 |
6.5604 | 116.0 | 1624 | 5.2645 |
6.4749 | 117.0 | 1638 | 5.2743 |
6.4891 | 118.0 | 1652 | 5.2869 |
6.4335 | 119.0 | 1666 | 5.2986 |
6.5114 | 120.0 | 1680 | 5.3109 |
6.4408 | 121.0 | 1694 | 5.3212 |
6.3667 | 122.0 | 1708 | 5.3298 |
6.3584 | 123.0 | 1722 | 5.3408 |
6.2831 | 124.0 | 1736 | 5.3542 |
6.3055 | 125.0 | 1750 | 5.3632 |
6.3451 | 126.0 | 1764 | 5.3695 |
6.2636 | 127.0 | 1778 | 5.3902 |
6.1909 | 128.0 | 1792 | 5.3946 |
6.1821 | 129.0 | 1806 | 5.4046 |
6.2121 | 130.0 | 1820 | 5.4137 |
6.2157 | 131.0 | 1834 | 5.4222 |
6.2115 | 132.0 | 1848 | 5.4285 |
6.1631 | 133.0 | 1862 | 5.4377 |
6.1074 | 134.0 | 1876 | 5.4495 |
6.0796 | 135.0 | 1890 | 5.4648 |
6.0416 | 136.0 | 1904 | 5.4746 |
6.1123 | 137.0 | 1918 | 5.4801 |
5.9995 | 138.0 | 1932 | 5.4863 |
6.0616 | 139.0 | 1946 | 5.4929 |
6.0098 | 140.0 | 1960 | 5.4995 |
5.9556 | 141.0 | 1974 | 5.5076 |
5.9591 | 142.0 | 1988 | 5.5204 |
5.9247 | 143.0 | 2002 | 5.5277 |
5.9409 | 144.0 | 2016 | 5.5407 |
5.9081 | 145.0 | 2030 | 5.5536 |
5.8869 | 146.0 | 2044 | 5.5602 |
5.9413 | 147.0 | 2058 | 5.5636 |
5.863 | 148.0 | 2072 | 5.5717 |
5.8174 | 149.0 | 2086 | 5.5750 |
5.7999 | 150.0 | 2100 | 5.5852 |
5.824 | 151.0 | 2114 | 5.5900 |
5.8427 | 152.0 | 2128 | 5.5987 |
5.6974 | 153.0 | 2142 | 5.6064 |
5.7389 | 154.0 | 2156 | 5.6120 |
5.773 | 155.0 | 2170 | 5.6140 |
5.7372 | 156.0 | 2184 | 5.6244 |
5.6788 | 157.0 | 2198 | 5.6267 |
5.694 | 158.0 | 2212 | 5.6346 |
5.6659 | 159.0 | 2226 | 5.6387 |
5.6601 | 160.0 | 2240 | 5.6455 |
5.7282 | 161.0 | 2254 | 5.6535 |
5.6995 | 162.0 | 2268 | 5.6572 |
5.6779 | 163.0 | 2282 | 5.6608 |
5.5655 | 164.0 | 2296 | 5.6728 |
5.6528 | 165.0 | 2310 | 5.6711 |
5.6853 | 166.0 | 2324 | 5.6748 |
5.6575 | 167.0 | 2338 | 5.6860 |
5.6327 | 168.0 | 2352 | 5.6873 |
5.6477 | 169.0 | 2366 | 5.6922 |
5.57 | 170.0 | 2380 | 5.6931 |
5.6212 | 171.0 | 2394 | 5.6994 |
5.5344 | 172.0 | 2408 | 5.7095 |
5.608 | 173.0 | 2422 | 5.7115 |
5.6274 | 174.0 | 2436 | 5.7163 |
5.5226 | 175.0 | 2450 | 5.7169 |
5.6039 | 176.0 | 2464 | 5.7195 |
5.5918 | 177.0 | 2478 | 5.7207 |
5.521 | 178.0 | 2492 | 5.7263 |
5.5004 | 179.0 | 2506 | 5.7269 |
5.5553 | 180.0 | 2520 | 5.7342 |
5.5396 | 181.0 | 2534 | 5.7351 |
5.5434 | 182.0 | 2548 | 5.7390 |
5.4705 | 183.0 | 2562 | 5.7413 |
5.515 | 184.0 | 2576 | 5.7436 |
5.5378 | 185.0 | 2590 | 5.7429 |
5.5125 | 186.0 | 2604 | 5.7467 |
5.5241 | 187.0 | 2618 | 5.7491 |
5.4869 | 188.0 | 2632 | 5.7518 |
5.492 | 189.0 | 2646 | 5.7538 |
5.5174 | 190.0 | 2660 | 5.7542 |
5.4813 | 191.0 | 2674 | 5.7575 |
5.4454 | 192.0 | 2688 | 5.7589 |
5.4896 | 193.0 | 2702 | 5.7597 |
5.3964 | 194.0 | 2716 | 5.7616 |
5.4764 | 195.0 | 2730 | 5.7630 |
5.4792 | 196.0 | 2744 | 5.7635 |
5.3841 | 197.0 | 2758 | 5.7653 |
5.4504 | 198.0 | 2772 | 5.7665 |
5.433 | 199.0 | 2786 | 5.7655 |
5.4426 | 200.0 | 2800 | 5.7659 |
5.4752 | 201.0 | 2814 | 5.7679 |
5.432 | 202.0 | 2828 | 5.7679 |
5.4162 | 203.0 | 2842 | 5.7692 |
5.4561 | 204.0 | 2856 | 5.7694 |
5.3887 | 205.0 | 2870 | 5.7700 |
5.4423 | 206.0 | 2884 | 5.7700 |
5.4008 | 207.0 | 2898 | 5.7699 |
5.4596 | 208.0 | 2912 | 5.7705 |
5.3616 | 209.0 | 2926 | 5.7706 |
5.3832 | 210.0 | 2940 | 5.7709 |
5.4724 | 211.0 | 2954 | 5.7709 |
5.4122 | 212.0 | 2968 | 5.7709 |
5.3969 | 213.0 | 2982 | 5.7709 |
5.4304 | 214.0 | 2996 | 5.7709 |
21.836 | 214.3019 | 3000 | 5.7709 |
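
In the table above, validation loss reaches its minimum of 4.6822 at epoch 57 (step 798) and then climbs to 5.7709 by step 3000 while training loss keeps falling, so the final checkpoint is not the one with the lowest validation loss. The following is a minimal sketch of how that checkpoint could be located from the logged rows. It assumes rows in the pipe-delimited layout used above. The names `parse_rows`, `best_checkpoint`, and `perplexity` are hypothetical helpers, not part of any library. The perplexity conversion additionally assumes the logged loss is a mean per-token cross-entropy in nats, which the card does not state.

```python
import math

# A few rows copied verbatim from the training results table
# (header and separator included to show that they are skipped).
ROWS = """\
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
8.7484 | 56.0 | 784 | 4.6828 |
8.7518 | 57.0 | 798 | 4.6822 |
8.5725 | 58.0 | 812 | 4.6833 |
5.4304 | 214.0 | 2996 | 5.7709 |
"""


def parse_rows(text):
    """Parse 'train | epoch | step | val |' lines into a list of dicts.

    Lines that do not hold exactly four numeric cells (the header and
    the '---' separator, for example) are ignored.
    """
    records = []
    for line in text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        try:
            records.append(
                {
                    "train_loss": float(cells[0]),
                    "epoch": float(cells[1]),
                    "step": int(cells[2]),
                    "val_loss": float(cells[3]),
                }
            )
        except ValueError:
            continue  # non-numeric row such as the header
    return records


def best_checkpoint(records):
    """Return the logged row with the lowest validation loss."""
    return min(records, key=lambda r: r["val_loss"])


def perplexity(loss):
    """Convert a loss to perplexity, ASSUMING mean cross-entropy in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    best = best_checkpoint(parse_rows(ROWS))
    print(best["epoch"], best["step"], best["val_loss"])
    print(round(perplexity(best["val_loss"]), 1))
```

Feeding the full table through `parse_rows` instead of the four-row excerpt selects the same row, since 4.6822 is the lowest validation loss anywhere in the log.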