impossible-llms-spanish-mirror-reversal

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 6.9937 (validation loss at the final step, 3000)
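
For a rough sense of scale, the loss can be converted to a perplexity. The sketch below assumes the reported value is a mean per-token cross-entropy in nats, which this card does not confirm. The card also lists a label_smoothing_factor of 0.1, which may inflate the reported loss, so the result is only a loose indicator.

```python
import math

# Final evaluation loss reported in this card.
eval_loss = 6.9937

# Assumption: the loss is a mean per-token cross-entropy in nats,
# so exponentiating it gives a perplexity-like number.
perplexity = math.exp(eval_loss)

print(f"approximate perplexity: {perplexity:.1f}")
```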

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
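
The derived values in the list above follow from the per-device settings, and the warmup ratio implies a concrete number of warmup steps. The sketch below reproduces that arithmetic and approximates the learning-rate curve as linear warmup followed by a half-cosine decay to zero. That shape is an assumption about what the cosine scheduler type does; the helper name lr_at is hypothetical and not part of any library.

```python
import math

# Values copied from the hyperparameter list in this card.
learning_rate = 1e-4
train_batch_size = 12          # per device
eval_batch_size = 8            # per device
num_devices = 4
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

# Effective batch sizes: per-device size times devices (times accumulation for training).
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
total_eval_batch_size = eval_batch_size * num_devices

# Number of optimizer steps spent warming up.
warmup_steps = math.ceil(training_steps * warmup_ratio)


def lr_at(step: int) -> float:
    """Hypothetical helper: linear warmup, then half-cosine decay to zero."""
    if step < warmup_steps:
        return learning_rate * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


print("total_train_batch_size:", total_train_batch_size)
print("total_eval_batch_size:", total_eval_batch_size)
print("warmup_steps:", warmup_steps)
for step in (0, 150, 300, 1650, 3000):
    print(step, lr_at(step))
```

Under these assumptions the learning rate peaks at step 300 and decays for the remaining 2700 steps.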

Training results

Training Loss Epoch Step Validation Loss
84.4162 0.9180 7 10.3431
77.8512 1.9180 14 9.5818
74.3777 2.9180 21 9.2231
72.9558 3.9180 28 9.0710
72.1401 4.9180 35 8.9695
70.8846 5.9180 42 8.8307
69.6382 6.9180 49 8.6507
67.9279 7.9180 56 8.4302
66.2401 8.9180 63 8.2178
64.6215 9.9180 70 8.0136
62.8811 10.9180 77 7.8174
61.1512 11.9180 84 7.6212
59.7228 12.9180 91 7.4259
58.0425 13.9180 98 7.2282
56.6221 14.9180 105 7.0344
55.1504 15.9180 112 6.8581
53.5989 16.9180 119 6.6987
52.7903 17.9180 126 6.5563
51.6098 18.9180 133 6.4371
50.7139 19.9180 140 6.3459
50.102 20.9180 147 6.2674
49.6932 21.9180 154 6.2228
49.144 22.9180 161 6.1715
48.9015 23.9180 168 6.1300
48.7174 24.9180 175 6.1021
48.2177 25.9180 182 6.0672
48.1698 26.9180 189 6.0371
48.0452 27.9180 196 6.0075
47.6822 28.9180 203 5.9850
47.0775 29.9180 210 5.9656
47.0889 30.9180 217 5.9483
47.0495 31.9180 224 5.9125
46.8446 32.9180 231 5.8966
46.8521 33.9180 238 5.8779
46.5076 34.9180 245 5.8588
46.2375 35.9180 252 5.8437
46.1051 36.9180 259 5.8306
45.9298 37.9180 266 5.8160
45.6758 38.9180 273 5.8035
45.7036 39.9180 280 5.7862
45.5413 40.9180 287 5.7700
45.2449 41.9180 294 5.7602
45.101 42.9180 301 5.7457
44.963 43.9180 308 5.7313
44.778 44.9180 315 5.7205
44.6973 45.9180 322 5.7077
44.3128 46.9180 329 5.6902
44.3436 47.9180 336 5.6782
44.055 48.9180 343 5.6623
43.8327 49.9180 350 5.6461
43.6442 50.9180 357 5.6394
43.3198 51.9180 364 5.6193
43.2727 52.9180 371 5.6041
43.1289 53.9180 378 5.5983
42.9209 54.9180 385 5.5709
42.9342 55.9180 392 5.5631
42.4359 56.9180 399 5.5559
42.3215 57.9180 406 5.5351
42.3673 58.9180 413 5.5182
41.9158 59.9180 420 5.5038
41.9819 60.9180 427 5.4985
41.7746 61.9180 434 5.4811
41.4981 62.9180 441 5.4636
41.2081 63.9180 448 5.4594
41.0731 64.9180 455 5.4432
41.1288 65.9180 462 5.4316
40.8899 66.9180 469 5.4163
40.5089 67.9180 476 5.4167
40.5938 68.9180 483 5.4056
40.1911 69.9180 490 5.3991
39.9726 70.9180 497 5.3848
39.8896 71.9180 504 5.3879
39.7376 72.9180 511 5.3698
39.3989 73.9180 518 5.3695
39.1616 74.9180 525 5.3568
39.1654 75.9180 532 5.3509
39.0076 76.9180 539 5.3506
38.861 77.9180 546 5.3558
38.5128 78.9180 553 5.3403
38.4616 79.9180 560 5.3381
38.415 80.9180 567 5.3444
38.059 81.9180 574 5.3374
38.1389 82.9180 581 5.3389
37.744 83.9180 588 5.3299
37.8217 84.9180 595 5.3213
37.398 85.9180 602 5.3205
37.417 86.9180 609 5.3301
37.2844 87.9180 616 5.3372
37.1349 88.9180 623 5.3331
36.9323 89.9180 630 5.3285
36.8712 90.9180 637 5.3288
36.6164 91.9180 644 5.3333
36.3797 92.9180 651 5.3395
36.1731 93.9180 658 5.3441
35.9206 94.9180 665 5.3456
35.8725 95.9180 672 5.3454
35.7979 96.9180 679 5.3435
35.6521 97.9180 686 5.3471
35.3987 98.9180 693 5.3561
35.3232 99.9180 700 5.3524
35.1982 100.9180 707 5.3661
34.886 101.9180 714 5.3686
34.7132 102.9180 721 5.3709
34.6847 103.9180 728 5.3774
34.5539 104.9180 735 5.3897
34.4671 105.9180 742 5.3989
34.2363 106.9180 749 5.3929
34.0945 107.9180 756 5.3963
33.8505 108.9180 763 5.4139
33.7776 109.9180 770 5.4137
33.7077 110.9180 777 5.4283
33.5768 111.9180 784 5.4255
33.4114 112.9180 791 5.4368
33.124 113.9180 798 5.4533
33.1255 114.9180 805 5.4452
32.9746 115.9180 812 5.4670
32.9757 116.9180 819 5.4674
32.7149 117.9180 826 5.4849
32.4399 118.9180 833 5.4895
32.6289 119.9180 840 5.5009
32.3678 120.9180 847 5.5007
32.1054 121.9180 854 5.5052
31.9792 122.9180 861 5.5272
32.0312 123.9180 868 5.5281
31.8027 124.9180 875 5.5397
31.7089 125.9180 882 5.5513
31.4487 126.9180 889 5.5479
31.3213 127.9180 896 5.5571
31.2986 128.9180 903 5.5726
31.1625 129.9180 910 5.5723
31.0116 130.9180 917 5.5939
30.9386 131.9180 924 5.6084
30.6873 132.9180 931 5.6066
30.5603 133.9180 938 5.6187
30.4922 134.9180 945 5.6356
30.5098 135.9180 952 5.6411
30.3877 136.9180 959 5.6489
30.0047 137.9180 966 5.6620
29.982 138.9180 973 5.6814
29.666 139.9180 980 5.6748
29.7369 140.9180 987 5.6936
29.5357 141.9180 994 5.7011
29.4863 142.9180 1001 5.7023
29.1884 143.9180 1008 5.7173
29.2733 144.9180 1015 5.7391
29.1444 145.9180 1022 5.7430
28.9668 146.9180 1029 5.7628
28.9572 147.9180 1036 5.7740
28.6542 148.9180 1043 5.7703
28.6571 149.9180 1050 5.7768
28.4431 150.9180 1057 5.7939
28.3375 151.9180 1064 5.8060
28.2275 152.9180 1071 5.8154
28.2147 153.9180 1078 5.8239
28.0644 154.9180 1085 5.8309
27.9749 155.9180 1092 5.8453
27.8662 156.9180 1099 5.8514
27.6157 157.9180 1106 5.8644
27.3961 158.9180 1113 5.8822
27.46 159.9180 1120 5.8790
27.5323 160.9180 1127 5.8899
27.1845 161.9180 1134 5.9004
27.1134 162.9180 1141 5.9200
27.0488 163.9180 1148 5.9319
26.798 164.9180 1155 5.9282
26.7074 165.9180 1162 5.9415
26.7968 166.9180 1169 5.9535
26.5976 167.9180 1176 5.9539
26.6141 168.9180 1183 5.9724
26.4868 169.9180 1190 5.9768
26.0997 170.9180 1197 5.9941
26.276 171.9180 1204 6.0098
26.1329 172.9180 1211 6.0084
25.9698 173.9180 1218 6.0104
25.7919 174.9180 1225 6.0339
25.7292 175.9180 1232 6.0430
25.5487 176.9180 1239 6.0521
25.6807 177.9180 1246 6.0665
25.5744 178.9180 1253 6.0639
25.4511 179.9180 1260 6.0746
25.1839 180.9180 1267 6.0985
25.102 181.9180 1274 6.0936
25.2993 182.9180 1281 6.1054
25.0789 183.9180 1288 6.1141
24.9031 184.9180 1295 6.1289
24.9472 185.9180 1302 6.1326
24.7081 186.9180 1309 6.1548
24.5715 187.9180 1316 6.1579
24.5298 188.9180 1323 6.1540
24.4873 189.9180 1330 6.1685
24.2975 190.9180 1337 6.1855
24.2701 191.9180 1344 6.1986
24.2495 192.9180 1351 6.2024
24.1608 193.9180 1358 6.2128
23.9288 194.9180 1365 6.2151
23.9611 195.9180 1372 6.2296
23.8268 196.9180 1379 6.2358
23.6677 197.9180 1386 6.2405
23.7449 198.9180 1393 6.2461
23.4324 199.9180 1400 6.2662
23.4854 200.9180 1407 6.2655
23.3554 201.9180 1414 6.2769
23.167 202.9180 1421 6.2847
23.2855 203.9180 1428 6.2861
23.1166 204.9180 1435 6.3017
23.0398 205.9180 1442 6.3119
23.0255 206.9180 1449 6.3172
22.999 207.9180 1456 6.3255
22.7308 208.9180 1463 6.3393
22.7178 209.9180 1470 6.3417
22.6128 210.9180 1477 6.3461
22.5973 211.9180 1484 6.3585
22.6145 212.9180 1491 6.3666
22.4369 213.9180 1498 6.3749
22.3656 214.9180 1505 6.3802
22.2833 215.9180 1512 6.3983
22.1951 216.9180 1519 6.3926
22.1625 217.9180 1526 6.4041
21.998 218.9180 1533 6.4135
21.991 219.9180 1540 6.4330
21.9023 220.9180 1547 6.4238
21.9138 221.9180 1554 6.4345
21.9563 222.9180 1561 6.4423
21.7125 223.9180 1568 6.4432
21.6526 224.9180 1575 6.4544
21.574 225.9180 1582 6.4674
21.5197 226.9180 1589 6.4717
21.475 227.9180 1596 6.4811
21.3944 228.9180 1603 6.4886
21.4001 229.9180 1610 6.4940
21.2599 230.9180 1617 6.5045
21.3074 231.9180 1624 6.5078
21.0136 232.9180 1631 6.5144
21.1079 233.9180 1638 6.5142
21.0878 234.9180 1645 6.5233
20.9775 235.9180 1652 6.5266
20.9687 236.9180 1659 6.5404
20.7984 237.9180 1666 6.5467
20.7934 238.9180 1673 6.5521
20.7419 239.9180 1680 6.5511
20.5449 240.9180 1687 6.5641
20.6149 241.9180 1694 6.5764
20.6499 242.9180 1701 6.5704
20.5261 243.9180 1708 6.5780
20.4831 244.9180 1715 6.5889
20.4239 245.9180 1722 6.5939
20.2128 246.9180 1729 6.6054
20.1934 247.9180 1736 6.6072
20.1968 248.9180 1743 6.6114
20.1866 249.9180 1750 6.6134
20.104 250.9180 1757 6.6175
20.0609 251.9180 1764 6.6316
20.0985 252.9180 1771 6.6390
19.9381 253.9180 1778 6.6366
19.9409 254.9180 1785 6.6414
19.8636 255.9180 1792 6.6460
19.8073 256.9180 1799 6.6524
19.8491 257.9180 1806 6.6585
19.7852 258.9180 1813 6.6658
19.6229 259.9180 1820 6.6708
19.5722 260.9180 1827 6.6739
19.5835 261.9180 1834 6.6854
19.5987 262.9180 1841 6.6936
19.4856 263.9180 1848 6.6930
19.5904 264.9180 1855 6.6983
19.3708 265.9180 1862 6.7038
19.3553 266.9180 1869 6.7077
19.3373 267.9180 1876 6.7083
19.2019 268.9180 1883 6.7167
19.1206 269.9180 1890 6.7259
19.1018 270.9180 1897 6.7236
19.2208 271.9180 1904 6.7353
19.0552 272.9180 1911 6.7368
19.0681 273.9180 1918 6.7394
19.0372 274.9180 1925 6.7420
19.0147 275.9180 1932 6.7465
18.9359 276.9180 1939 6.7533
18.9365 277.9180 1946 6.7533
18.8647 278.9180 1953 6.7576
18.7693 279.9180 1960 6.7628
18.7637 280.9180 1967 6.7683
18.8001 281.9180 1974 6.7683
18.6263 282.9180 1981 6.7707
18.6731 283.9180 1988 6.7820
18.6376 284.9180 1995 6.7786
18.6834 285.9180 2002 6.7890
18.5305 286.9180 2009 6.7860
18.5434 287.9180 2016 6.7951
18.4738 288.9180 2023 6.7954
18.5392 289.9180 2030 6.8031
18.4525 290.9180 2037 6.8034
18.3516 291.9180 2044 6.8089
18.4084 292.9180 2051 6.8107
18.3341 293.9180 2058 6.8176
18.2295 294.9180 2065 6.8240
18.2289 295.9180 2072 6.8254
18.2957 296.9180 2079 6.8273
18.1978 297.9180 2086 6.8269
18.1374 298.9180 2093 6.8368
18.1589 299.9180 2100 6.8345
18.0843 300.9180 2107 6.8365
18.0587 301.9180 2114 6.8508
17.9929 302.9180 2121 6.8402
17.9596 303.9180 2128 6.8487
17.9953 304.9180 2135 6.8466
17.969 305.9180 2142 6.8532
18.0339 306.9180 2149 6.8526
17.8757 307.9180 2156 6.8600
17.8847 308.9180 2163 6.8596
17.8781 309.9180 2170 6.8660
17.8195 310.9180 2177 6.8693
17.8741 311.9180 2184 6.8666
17.7714 312.9180 2191 6.8732
17.7876 313.9180 2198 6.8768
17.7111 314.9180 2205 6.8820
17.7864 315.9180 2212 6.8847
17.7075 316.9180 2219 6.8839
17.5483 317.9180 2226 6.8867
17.7455 318.9180 2233 6.8962
17.598 319.9180 2240 6.8898
17.6425 320.9180 2247 6.8977
17.6195 321.9180 2254 6.8918
17.5003 322.9180 2261 6.8971
17.5788 323.9180 2268 6.9069
17.5225 324.9180 2275 6.9025
17.5252 325.9180 2282 6.9068
17.5761 326.9180 2289 6.9104
17.4598 327.9180 2296 6.9088
17.3877 328.9180 2303 6.9135
17.3781 329.9180 2310 6.9171
17.4783 330.9180 2317 6.9221
17.295 331.9180 2324 6.9189
17.3924 332.9180 2331 6.9210
17.2561 333.9180 2338 6.9229
17.3171 334.9180 2345 6.9279
17.3314 335.9180 2352 6.9260
17.345 336.9180 2359 6.9280
17.2402 337.9180 2366 6.9335
17.2594 338.9180 2373 6.9359
17.1937 339.9180 2380 6.9325
17.1731 340.9180 2387 6.9320
17.2473 341.9180 2394 6.9390
17.1868 342.9180 2401 6.9378
17.1588 343.9180 2408 6.9383
17.1417 344.9180 2415 6.9439
17.0871 345.9180 2422 6.9438
17.104 346.9180 2429 6.9450
17.1095 347.9180 2436 6.9461
17.1458 348.9180 2443 6.9487
17.0723 349.9180 2450 6.9488
17.1555 350.9180 2457 6.9481
17.107 351.9180 2464 6.9551
17.0555 352.9180 2471 6.9532
17.057 353.9180 2478 6.9550
17.0571 354.9180 2485 6.9561
17.0464 355.9180 2492 6.9564
16.9419 356.9180 2499 6.9552
16.9971 357.9180 2506 6.9591
17.0158 358.9180 2513 6.9612
16.9852 359.9180 2520 6.9609
16.9336 360.9180 2527 6.9651
16.9507 361.9180 2534 6.9685
16.9286 362.9180 2541 6.9668
16.8417 363.9180 2548 6.9698
16.9085 364.9180 2555 6.9729
16.9229 365.9180 2562 6.9705
16.893 366.9180 2569 6.9724
16.8789 367.9180 2576 6.9681
16.8963 368.9180 2583 6.9730
16.8282 369.9180 2590 6.9736
16.8398 370.9180 2597 6.9757
16.8059 371.9180 2604 6.9758
16.8391 372.9180 2611 6.9773
16.9314 373.9180 2618 6.9767
16.8705 374.9180 2625 6.9770
16.7638 375.9180 2632 6.9794
16.8538 376.9180 2639 6.9801
16.7878 377.9180 2646 6.9798
16.786 378.9180 2653 6.9828
16.7546 379.9180 2660 6.9813
16.8046 380.9180 2667 6.9815
16.7852 381.9180 2674 6.9852
16.734 382.9180 2681 6.9834
16.8187 383.9180 2688 6.9820
16.7764 384.9180 2695 6.9857
16.7835 385.9180 2702 6.9861
16.7463 386.9180 2709 6.9860
16.6309 387.9180 2716 6.9865
16.6992 388.9180 2723 6.9881
16.7021 389.9180 2730 6.9872
16.7778 390.9180 2737 6.9873
16.784 391.9180 2744 6.9870
16.7504 392.9180 2751 6.9877
16.7041 393.9180 2758 6.9891
16.7505 394.9180 2765 6.9917
16.7962 395.9180 2772 6.9908
16.7077 396.9180 2779 6.9912
16.7166 397.9180 2786 6.9910
16.7462 398.9180 2793 6.9917
16.713 399.9180 2800 6.9915
16.6515 400.9180 2807 6.9921
16.7043 401.9180 2814 6.9916
16.719 402.9180 2821 6.9915
16.697 403.9180 2828 6.9929
16.7353 404.9180 2835 6.9926
16.7601 405.9180 2842 6.9916
16.6814 406.9180 2849 6.9921
16.7516 407.9180 2856 6.9929
16.6698 408.9180 2863 6.9931
16.6765 409.9180 2870 6.9941
16.6709 410.9180 2877 6.9936
16.7178 411.9180 2884 6.9932
16.6784 412.9180 2891 6.9935
16.7612 413.9180 2898 6.9933
16.7469 414.9180 2905 6.9932
16.6571 415.9180 2912 6.9934
16.6858 416.9180 2919 6.9936
16.6591 417.9180 2926 6.9935
16.7057 418.9180 2933 6.9935
16.7523 419.9180 2940 6.9936
16.7288 420.9180 2947 6.9936
16.6824 421.9180 2954 6.9936
16.6956 422.9180 2961 6.9937
16.659 423.9180 2968 6.9937
16.6825 424.9180 2975 6.9937
16.6794 425.9180 2982 6.9937
16.677 426.9180 2989 6.9937
16.6037 427.9180 2996 6.9937
16.6751 428.5246 3000 6.9937
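
In the table above, validation loss reaches its minimum partway through training and then rises while training loss keeps falling. The sketch below parses a handful of rows copied verbatim from the table and picks the one with the lowest validation loss. The Row type and parse_rows helper are illustrative names, and the same code should work on the full table pasted in place of the excerpt.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Excerpt of the training results table (same whitespace-separated layout).
TABLE_EXCERPT = """
84.4162 0.9180 7 10.3431
37.8217 84.9180 595 5.3213
37.398 85.9180 602 5.3205
37.417 86.9180 609 5.3301
29.4863 142.9180 1001 5.7023
16.6751 428.5246 3000 6.9937
"""


def parse_rows(text: str) -> List[Row]:
    """Parse 'train_loss epoch step val_loss' lines, skipping blank lines."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss = line.split()
        rows.append(Row(float(train_loss), float(epoch), int(step), float(val_loss)))
    return rows


rows = parse_rows(TABLE_EXCERPT)
best = min(rows, key=lambda r: r.val_loss)   # lowest validation loss
final = rows[-1]                             # last logged step
gap = final.val_loss - best.val_loss

print(f"best step: {best.step} (validation loss {best.val_loss})")
print(f"final step: {final.step} (validation loss {final.val_loss})")
print(f"increase from best to final: {gap:.4f}")
```

On these rows the lowest validation loss is at step 602, so the final checkpoint at step 3000 is not the best one by this metric.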

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0

Model size

  • Parameters: 126M
  • Tensor type: F32
  • Format: Safetensors

Collection including IParraMartin/impossible-llms-spanish-mirror-reversal