Natural Order LMs
Collection (35 items): all the models trained in the paper 'Natural Order: Cross-lingual Limits of Transformer Language Acquisition'
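This model is a fine-tuned version of an unspecified base model on an unknown dataset. On the evaluation set it reaches a validation loss of 4.4268 at the final training step (step 3000 in the training results table).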
Model description, intended uses and limitations, and training and evaluation data: more information needed.
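The hyperparameters used during training are not listed here. The training results below show 87 optimizer steps per epoch and a run that ends at step 3000 (about 34.5 epochs).
Training results: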
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
22.3579 | 1.0 | 87 | 7.3625 |
17.7285 | 2.0 | 174 | 5.9282 |
17.3131 | 3.0 | 261 | 5.7417 |
16.9484 | 4.0 | 348 | 5.5702 |
16.2015 | 5.0 | 435 | 5.3600 |
15.635 | 6.0 | 522 | 5.1832 |
15.2242 | 7.0 | 609 | 5.0535 |
14.9803 | 8.0 | 696 | 4.9441 |
14.693 | 9.0 | 783 | 4.8592 |
14.4182 | 10.0 | 870 | 4.7920 |
14.3186 | 11.0 | 957 | 4.7325 |
14.0921 | 12.0 | 1044 | 4.6868 |
13.8969 | 13.0 | 1131 | 4.6437 |
13.8353 | 14.0 | 1218 | 4.6098 |
13.6798 | 15.0 | 1305 | 4.5795 |
13.637 | 16.0 | 1392 | 4.5563 |
13.5227 | 17.0 | 1479 | 4.5350 |
13.4718 | 18.0 | 1566 | 4.5154 |
13.2136 | 19.0 | 1653 | 4.4986 |
13.3515 | 20.0 | 1740 | 4.4878 |
13.2931 | 21.0 | 1827 | 4.4752 |
13.1062 | 22.0 | 1914 | 4.4651 |
13.1325 | 23.0 | 2001 | 4.4568 |
13.0963 | 24.0 | 2088 | 4.4508 |
13.1318 | 25.0 | 2175 | 4.4443 |
12.8938 | 26.0 | 2262 | 4.4397 |
12.935 | 27.0 | 2349 | 4.4364 |
13.1248 | 28.0 | 2436 | 4.4331 |
12.9068 | 29.0 | 2523 | 4.4304 |
12.8866 | 30.0 | 2610 | 4.4293 |
12.9587 | 31.0 | 2697 | 4.4282 |
12.8039 | 32.0 | 2784 | 4.4273 |
12.7212 | 33.0 | 2871 | 4.4270 |
12.8857 | 34.0 | 2958 | 4.4268 |
34.5151 | 34.4863 | 3000 | 4.4268 |
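
The validation losses above are easier to compare across models when expressed as perplexity. The following is a minimal sketch, assuming the reported validation loss is a mean per-token cross-entropy in nats (the card does not state this explicitly). The names `ROWS`, `perplexity`, and `steps_per_epoch` are illustrative and not part of the original card; the rows are a hand-copied subset of the table.

```python
import math

# A few (epoch, step, validation_loss) rows copied from the table above.
ROWS = [
    (1.0, 87, 7.3625),
    (10.0, 870, 4.7920),
    (20.0, 1740, 4.4878),
    (34.0, 2958, 4.4268),
]


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy to perplexity.

    Assumption: the loss is measured in nats (natural log), which is
    the usual convention for cross-entropy losses but is not confirmed
    by the card.
    """
    return math.exp(loss_nats)


def steps_per_epoch(rows):
    """Return the set of steps-per-epoch values implied by whole-epoch rows."""
    return {round(step / epoch) for epoch, step, _ in rows}


if __name__ == "__main__":
    for epoch, step, loss in ROWS:
        print(
            f"epoch {epoch:>4.1f}  step {step:>4d}  "
            f"loss {loss:.4f}  ppl {perplexity(loss):.1f}"
        )
    print("steps per epoch:", steps_per_epoch(ROWS))
```

Under that assumption, the final validation loss of 4.4268 corresponds to a perplexity of roughly 84. The per-token interpretation depends on the tokenizer, so perplexities are only comparable between models that share a vocabulary.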