Natural Order LMs
Collection
All the models trained in the paper 'Natural Order: Cross-lingual Limits of Transformer Language Acquisition' (35 items).
This model is a fine-tuned version of an unspecified base model, trained on an unknown dataset. Its loss on the evaluation set at the end of training, as reported in the last row of the training results, is 5.5790.
Model description: More information needed
Intended uses & limitations: More information needed
Training and evaluation data: More information needed
The hyperparameters used during training are not listed here. Training results, logged once per epoch (14 steps each) up to step 3000:
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
19.5388 | 1.0 | 14 | 9.5505 |
17.9401 | 2.0 | 28 | 8.8721 |
16.7452 | 3.0 | 42 | 8.3110 |
15.9672 | 4.0 | 56 | 7.9213 |
15.2666 | 5.0 | 70 | 7.5166 |
14.221 | 6.0 | 84 | 7.0987 |
13.4173 | 7.0 | 98 | 6.6794 |
12.696 | 8.0 | 112 | 6.2922 |
12.0732 | 9.0 | 126 | 5.9727 |
11.5407 | 10.0 | 140 | 5.7545 |
11.1373 | 11.0 | 154 | 5.6278 |
11.0415 | 12.0 | 168 | 5.5557 |
11.0857 | 13.0 | 182 | 5.4994 |
10.8756 | 14.0 | 196 | 5.4403 |
10.7797 | 15.0 | 210 | 5.3880 |
10.8094 | 16.0 | 224 | 5.3437 |
10.545 | 17.0 | 238 | 5.3122 |
10.4286 | 18.0 | 252 | 5.2743 |
10.4704 | 19.0 | 266 | 5.2344 |
10.4322 | 20.0 | 280 | 5.1965 |
10.3955 | 21.0 | 294 | 5.1627 |
10.1824 | 22.0 | 308 | 5.1350 |
10.3351 | 23.0 | 322 | 5.0994 |
10.0999 | 24.0 | 336 | 5.0613 |
9.9475 | 25.0 | 350 | 5.0338 |
10.0076 | 26.0 | 364 | 5.0021 |
9.8902 | 27.0 | 378 | 4.9742 |
9.7566 | 28.0 | 392 | 4.9420 |
9.5947 | 29.0 | 406 | 4.9075 |
9.5567 | 30.0 | 420 | 4.8829 |
9.6307 | 31.0 | 434 | 4.8543 |
9.5312 | 32.0 | 448 | 4.8264 |
9.631 | 33.0 | 462 | 4.8035 |
9.374 | 34.0 | 476 | 4.7773 |
9.2836 | 35.0 | 490 | 4.7586 |
8.9454 | 36.0 | 504 | 4.7365 |
9.245 | 37.0 | 518 | 4.7158 |
9.3739 | 38.0 | 532 | 4.7004 |
9.1333 | 39.0 | 546 | 4.6807 |
9.1148 | 40.0 | 560 | 4.6652 |
9.0499 | 41.0 | 574 | 4.6520 |
9.1336 | 42.0 | 588 | 4.6394 |
8.8996 | 43.0 | 602 | 4.6258 |
8.9487 | 44.0 | 616 | 4.6114 |
8.7108 | 45.0 | 630 | 4.6009 |
8.8296 | 46.0 | 644 | 4.5939 |
8.8617 | 47.0 | 658 | 4.5823 |
8.6744 | 48.0 | 672 | 4.5743 |
8.6516 | 49.0 | 686 | 4.5677 |
8.7278 | 50.0 | 700 | 4.5603 |
8.754 | 51.0 | 714 | 4.5555 |
8.5846 | 52.0 | 728 | 4.5513 |
8.6258 | 53.0 | 742 | 4.5455 |
8.5361 | 54.0 | 756 | 4.5430 |
8.301 | 55.0 | 770 | 4.5423 |
8.461 | 56.0 | 784 | 4.5390 |
8.3438 | 57.0 | 798 | 4.5387 |
8.3601 | 58.0 | 812 | 4.5399 |
8.2845 | 59.0 | 826 | 4.5374 |
8.3098 | 60.0 | 840 | 4.5390 |
8.1923 | 61.0 | 854 | 4.5396 |
8.188 | 62.0 | 868 | 4.5458 |
8.2348 | 63.0 | 882 | 4.5458 |
8.0945 | 64.0 | 896 | 4.5509 |
8.0231 | 65.0 | 910 | 4.5545 |
8.1322 | 66.0 | 924 | 4.5577 |
8.0243 | 67.0 | 938 | 4.5633 |
7.967 | 68.0 | 952 | 4.5667 |
7.8999 | 69.0 | 966 | 4.5763 |
7.8098 | 70.0 | 980 | 4.5799 |
7.8359 | 71.0 | 994 | 4.5920 |
7.8627 | 72.0 | 1008 | 4.5911 |
7.7559 | 73.0 | 1022 | 4.6069 |
7.753 | 74.0 | 1036 | 4.6096 |
7.7662 | 75.0 | 1050 | 4.6219 |
7.6475 | 76.0 | 1064 | 4.6295 |
7.4705 | 77.0 | 1078 | 4.6425 |
7.5925 | 78.0 | 1092 | 4.6432 |
7.5229 | 79.0 | 1106 | 4.6607 |
7.5707 | 80.0 | 1120 | 4.6689 |
7.4744 | 81.0 | 1134 | 4.6762 |
7.4192 | 82.0 | 1148 | 4.6899 |
7.3259 | 83.0 | 1162 | 4.6976 |
7.3084 | 84.0 | 1176 | 4.7109 |
7.3203 | 85.0 | 1190 | 4.7242 |
7.1939 | 86.0 | 1204 | 4.7307 |
7.1368 | 87.0 | 1218 | 4.7503 |
7.2996 | 88.0 | 1232 | 4.7580 |
7.0555 | 89.0 | 1246 | 4.7690 |
7.1743 | 90.0 | 1260 | 4.7812 |
7.0033 | 91.0 | 1274 | 4.7911 |
7.0944 | 92.0 | 1288 | 4.8034 |
6.893 | 93.0 | 1302 | 4.8147 |
6.9475 | 94.0 | 1316 | 4.8248 |
7.0015 | 95.0 | 1330 | 4.8375 |
6.8675 | 96.0 | 1344 | 4.8429 |
6.8802 | 97.0 | 1358 | 4.8640 |
6.884 | 98.0 | 1372 | 4.8785 |
6.7255 | 99.0 | 1386 | 4.8808 |
6.7309 | 100.0 | 1400 | 4.8979 |
6.7191 | 101.0 | 1414 | 4.9133 |
6.7445 | 102.0 | 1428 | 4.9165 |
6.7745 | 103.0 | 1442 | 4.9349 |
6.5724 | 104.0 | 1456 | 4.9463 |
6.6491 | 105.0 | 1470 | 4.9572 |
6.5131 | 106.0 | 1484 | 4.9676 |
6.6026 | 107.0 | 1498 | 4.9795 |
6.4379 | 108.0 | 1512 | 4.9951 |
6.4879 | 109.0 | 1526 | 5.0042 |
6.5413 | 110.0 | 1540 | 5.0177 |
6.41 | 111.0 | 1554 | 5.0247 |
6.4405 | 112.0 | 1568 | 5.0434 |
6.3561 | 113.0 | 1582 | 5.0554 |
6.364 | 114.0 | 1596 | 5.0739 |
6.3571 | 115.0 | 1610 | 5.0756 |
6.2677 | 116.0 | 1624 | 5.0883 |
6.2884 | 117.0 | 1638 | 5.0951 |
6.3058 | 118.0 | 1652 | 5.1093 |
6.266 | 119.0 | 1666 | 5.1246 |
6.2004 | 120.0 | 1680 | 5.1309 |
6.2179 | 121.0 | 1694 | 5.1415 |
6.1123 | 122.0 | 1708 | 5.1508 |
6.0874 | 123.0 | 1722 | 5.1689 |
6.1809 | 124.0 | 1736 | 5.1704 |
6.1127 | 125.0 | 1750 | 5.1792 |
5.995 | 126.0 | 1764 | 5.1942 |
6.0025 | 127.0 | 1778 | 5.2078 |
5.9974 | 128.0 | 1792 | 5.2143 |
6.0356 | 129.0 | 1806 | 5.2271 |
5.9101 | 130.0 | 1820 | 5.2342 |
6.067 | 131.0 | 1834 | 5.2491 |
5.9895 | 132.0 | 1848 | 5.2499 |
5.9256 | 133.0 | 1862 | 5.2619 |
5.9277 | 134.0 | 1876 | 5.2724 |
5.8285 | 135.0 | 1890 | 5.2761 |
5.8923 | 136.0 | 1904 | 5.2850 |
5.8769 | 137.0 | 1918 | 5.2900 |
5.8877 | 138.0 | 1932 | 5.3080 |
5.8414 | 139.0 | 1946 | 5.3127 |
5.8333 | 140.0 | 1960 | 5.3192 |
5.8625 | 141.0 | 1974 | 5.3270 |
5.7775 | 142.0 | 1988 | 5.3342 |
5.784 | 143.0 | 2002 | 5.3460 |
5.6887 | 144.0 | 2016 | 5.3503 |
5.7328 | 145.0 | 2030 | 5.3657 |
5.6659 | 146.0 | 2044 | 5.3597 |
5.6671 | 147.0 | 2058 | 5.3763 |
5.682 | 148.0 | 2072 | 5.3856 |
5.6351 | 149.0 | 2086 | 5.3962 |
5.6191 | 150.0 | 2100 | 5.3962 |
5.623 | 151.0 | 2114 | 5.4065 |
5.6142 | 152.0 | 2128 | 5.4114 |
5.5707 | 153.0 | 2142 | 5.4184 |
5.5349 | 154.0 | 2156 | 5.4250 |
5.5473 | 155.0 | 2170 | 5.4287 |
5.5398 | 156.0 | 2184 | 5.4351 |
5.5673 | 157.0 | 2198 | 5.4374 |
5.5278 | 158.0 | 2212 | 5.4449 |
5.5364 | 159.0 | 2226 | 5.4522 |
5.5453 | 160.0 | 2240 | 5.4590 |
5.5011 | 161.0 | 2254 | 5.4665 |
5.4966 | 162.0 | 2268 | 5.4745 |
5.4564 | 163.0 | 2282 | 5.4779 |
5.479 | 164.0 | 2296 | 5.4831 |
5.4074 | 165.0 | 2310 | 5.4861 |
5.4466 | 166.0 | 2324 | 5.4911 |
5.4323 | 167.0 | 2338 | 5.4945 |
5.4185 | 168.0 | 2352 | 5.4986 |
5.4081 | 169.0 | 2366 | 5.5008 |
5.4454 | 170.0 | 2380 | 5.5081 |
5.3599 | 171.0 | 2394 | 5.5100 |
5.3657 | 172.0 | 2408 | 5.5110 |
5.3012 | 173.0 | 2422 | 5.5204 |
5.3654 | 174.0 | 2436 | 5.5213 |
5.3618 | 175.0 | 2450 | 5.5238 |
5.3746 | 176.0 | 2464 | 5.5344 |
5.348 | 177.0 | 2478 | 5.5321 |
5.3121 | 178.0 | 2492 | 5.5341 |
5.3438 | 179.0 | 2506 | 5.5416 |
5.3456 | 180.0 | 2520 | 5.5440 |
5.3452 | 181.0 | 2534 | 5.5471 |
5.302 | 182.0 | 2548 | 5.5489 |
5.3466 | 183.0 | 2562 | 5.5487 |
5.3351 | 184.0 | 2576 | 5.5524 |
5.292 | 185.0 | 2590 | 5.5547 |
5.3786 | 186.0 | 2604 | 5.5566 |
5.2728 | 187.0 | 2618 | 5.5592 |
5.3039 | 188.0 | 2632 | 5.5618 |
5.2933 | 189.0 | 2646 | 5.5623 |
5.2462 | 190.0 | 2660 | 5.5642 |
5.2955 | 191.0 | 2674 | 5.5672 |
5.3091 | 192.0 | 2688 | 5.5677 |
5.2952 | 193.0 | 2702 | 5.5695 |
5.2556 | 194.0 | 2716 | 5.5697 |
5.256 | 195.0 | 2730 | 5.5697 |
5.2803 | 196.0 | 2744 | 5.5723 |
5.2666 | 197.0 | 2758 | 5.5742 |
5.282 | 198.0 | 2772 | 5.5750 |
5.2596 | 199.0 | 2786 | 5.5758 |
5.242 | 200.0 | 2800 | 5.5766 |
5.3405 | 201.0 | 2814 | 5.5762 |
5.3114 | 202.0 | 2828 | 5.5767 |
5.2631 | 203.0 | 2842 | 5.5775 |
5.2528 | 204.0 | 2856 | 5.5776 |
5.2509 | 205.0 | 2870 | 5.5780 |
5.3044 | 206.0 | 2884 | 5.5782 |
5.2655 | 207.0 | 2898 | 5.5781 |
5.2337 | 208.0 | 2912 | 5.5785 |
5.2294 | 209.0 | 2926 | 5.5790 |
5.25 | 210.0 | 2940 | 5.5787 |
5.2545 | 211.0 | 2954 | 5.5788 |
5.2359 | 212.0 | 2968 | 5.5789 |
5.2535 | 213.0 | 2982 | 5.5790 |
5.2237 | 214.0 | 2996 | 5.5790 |
21.0264 | 214.3019 | 3000 | 5.5790 |
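
The validation loss in the table above falls, bottoms out, and then rises again while the training loss keeps decreasing. A minimal sketch of how one might locate the lowest-validation-loss row from such a log follows. It assumes the rows are available as pipe-delimited text in the same column order as the table; the names `ROWS`, `parse_rows`, and `best_checkpoint` are hypothetical helpers for illustration, not part of the original training code, and only a short excerpt of the table is embedded.

```python
# Excerpt of the training-results table (columns: training loss, epoch, step,
# validation loss). Illustrative subset only, copied from the rows above.
ROWS = """
8.3438 | 57.0 | 798 | 4.5387 |
8.3601 | 58.0 | 812 | 4.5399 |
8.2845 | 59.0 | 826 | 4.5374 |
8.3098 | 60.0 | 840 | 4.5390 |
5.2237 | 214.0 | 2996 | 5.5790 |
"""


def parse_rows(text):
    """Parse pipe-delimited rows into (train_loss, epoch, step, val_loss) tuples."""
    rows = []
    for line in text.strip().splitlines():
        # Drop the trailing pipe, then split into the four cells.
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        train_loss, epoch, step, val_loss = cells
        rows.append((float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[3])


rows = parse_rows(ROWS)
best = best_checkpoint(rows)
final = rows[-1]
print(f"best:  epoch {best[1]:g}, step {best[2]}, val loss {best[3]:.4f}")
print(f"final: epoch {final[1]:g}, step {final[2]}, val loss {final[3]:.4f}")
```

Applied to the full table, the same search picks epoch 59 (step 826, validation loss 4.5374), well below the 5.5790 reported in the final row. Whether the published weights correspond to the final step or to an earlier checkpoint is not stated in this card.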