impossible-llms-spanish-random

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 8.1724

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (torch) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
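The reported total_train_batch_size and warmup length follow directly from the other settings: the effective batch size is the product of the per-device batch size, the number of devices, and the gradient-accumulation steps, and the warmup length is the warmup ratio times the total training steps. A minimal sanity check, using only the values listed above:

```python
# Values copied from the hyperparameter list above.
train_batch_size = 12            # per-device train batch size
num_devices = 4                  # multi-GPU setup
gradient_accumulation_steps = 8
training_steps = 3000
warmup_ratio = 0.1

# Effective (total) train batch size per optimizer step.
total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)    # 384, matching the reported value

# Warmup steps implied by the cosine schedule's warmup ratio.
warmup_steps = int(warmup_ratio * training_steps)
print(warmup_steps)              # 300
```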

Training results

Training Loss Epoch Step Validation Loss
82.972 1.0 8 10.1437
75.8665 2.0 16 9.4079
72.9421 3.0 24 9.0817
71.5898 4.0 32 8.9488
70.6476 5.0 40 8.8065
68.8115 6.0 48 8.6534
67.8914 7.0 56 8.4742
66.0709 8.0 64 8.2686
64.3698 9.0 72 8.0523
62.7776 10.0 80 7.8382
60.7833 11.0 88 7.6234
59.2516 12.0 96 7.4120
57.4926 13.0 104 7.2052
55.9101 14.0 112 7.0040
54.4078 15.0 120 6.8264
53.194 16.0 128 6.6852
52.4471 17.0 136 6.5875
51.8722 18.0 144 6.5158
51.3478 19.0 152 6.4705
50.9916 20.0 160 6.4311
50.7082 21.0 168 6.3975
50.6366 22.0 176 6.3702
50.343 23.0 184 6.3389
49.9145 24.0 192 6.3057
49.838 25.0 200 6.2873
49.7563 26.0 208 6.2540
49.5158 27.0 216 6.2257
49.2432 28.0 224 6.2086
48.762 29.0 232 6.1814
48.8961 30.0 240 6.1708
48.4002 31.0 248 6.1538
48.3542 32.0 256 6.1337
48.1679 33.0 264 6.1212
47.8023 34.0 272 6.1096
47.7961 35.0 280 6.0889
47.7335 36.0 288 6.0733
47.5702 37.0 296 6.0695
47.2494 38.0 304 6.0505
47.2174 39.0 312 6.0415
47.1788 40.0 320 6.0221
46.554 41.0 328 6.0133
46.7689 42.0 336 5.9990
46.4913 43.0 344 5.9892
46.1306 44.0 352 5.9711
46.1097 45.0 360 5.9614
45.7164 46.0 368 5.9520
45.5182 47.0 376 5.9496
45.4885 48.0 384 5.9320
45.1577 49.0 392 5.9272
45.0292 50.0 400 5.9191
44.8174 51.0 408 5.9093
44.6519 52.0 416 5.9019
44.6193 53.0 424 5.8935
44.3462 54.0 432 5.8878
44.0637 55.0 440 5.8897
44.1169 56.0 448 5.8865
43.9161 57.0 456 5.8805
43.7147 58.0 464 5.8861
43.7084 59.0 472 5.8782
43.2609 60.0 480 5.8831
43.2273 61.0 488 5.8801
42.7276 62.0 496 5.8861
43.1133 63.0 504 5.8880
42.7384 64.0 512 5.8816
42.5384 65.0 520 5.8900
42.2051 66.0 528 5.8990
41.9476 67.0 536 5.9014
41.9329 68.0 544 5.9072
41.7742 69.0 552 5.9140
41.463 70.0 560 5.9210
41.4958 71.0 568 5.9284
41.136 72.0 576 5.9301
40.956 73.0 584 5.9540
40.8498 74.0 592 5.9539
40.5533 75.0 600 5.9616
40.2274 76.0 608 5.9715
40.3252 77.0 616 5.9883
39.8976 78.0 624 6.0010
39.8581 79.0 632 6.0068
39.8272 80.0 640 6.0155
39.4982 81.0 648 6.0321
39.5351 82.0 656 6.0410
38.9904 83.0 664 6.0553
39.1677 84.0 672 6.0767
38.8723 85.0 680 6.0767
38.5522 86.0 688 6.0952
38.4087 87.0 696 6.1166
38.1894 88.0 704 6.1213
38.0162 89.0 712 6.1408
37.7106 90.0 720 6.1528
37.5611 91.0 728 6.1629
37.5819 92.0 736 6.1787
37.2784 93.0 744 6.2005
37.1369 94.0 752 6.2101
36.9646 95.0 760 6.2348
36.5899 96.0 768 6.2418
36.4853 97.0 776 6.2503
36.2162 98.0 784 6.2787
36.0686 99.0 792 6.2848
36.0744 100.0 800 6.2993
35.6287 101.0 808 6.3291
35.7513 102.0 816 6.3400
35.4872 103.0 824 6.3426
35.3318 104.0 832 6.3629
34.9538 105.0 840 6.3858
34.691 106.0 848 6.3975
34.5705 107.0 856 6.4129
34.4758 108.0 864 6.4285
34.2078 109.0 872 6.4490
34.1076 110.0 880 6.4658
34.0043 111.0 888 6.4769
34.0075 112.0 896 6.4995
33.6314 113.0 904 6.5134
33.3817 114.0 912 6.5242
33.1647 115.0 920 6.5635
33.0683 116.0 928 6.5620
32.7977 117.0 936 6.5616
32.8607 118.0 944 6.5835
32.6613 119.0 952 6.6042
32.4106 120.0 960 6.6099
32.3315 121.0 968 6.6254
32.1132 122.0 976 6.6463
31.9501 123.0 984 6.6597
31.7851 124.0 992 6.6751
31.4313 125.0 1000 6.6976
31.5001 126.0 1008 6.7109
31.3214 127.0 1016 6.7316
31.1682 128.0 1024 6.7486
30.9942 129.0 1032 6.7701
30.8465 130.0 1040 6.7661
30.7709 131.0 1048 6.7948
30.398 132.0 1056 6.8041
30.3257 133.0 1064 6.8126
30.3051 134.0 1072 6.8455
30.0202 135.0 1080 6.8592
29.9442 136.0 1088 6.8462
29.7463 137.0 1096 6.8768
29.6297 138.0 1104 6.8716
29.4535 139.0 1112 6.9012
29.4787 140.0 1120 6.9159
29.1604 141.0 1128 6.9202
29.0881 142.0 1136 6.9359
28.9373 143.0 1144 6.9679
28.8773 144.0 1152 6.9782
28.7431 145.0 1160 6.9778
28.6143 146.0 1168 6.9963
28.3066 147.0 1176 7.0068
28.2248 148.0 1184 7.0118
28.2809 149.0 1192 7.0317
28.0549 150.0 1200 7.0388
27.9011 151.0 1208 7.0743
27.8213 152.0 1216 7.0790
27.6611 153.0 1224 7.0833
27.4538 154.0 1232 7.1158
27.4113 155.0 1240 7.1376
27.3292 156.0 1248 7.1283
27.1707 157.0 1256 7.1569
27.0338 158.0 1264 7.1534
26.9888 159.0 1272 7.1748
26.8195 160.0 1280 7.1789
26.5915 161.0 1288 7.1898
26.59 162.0 1296 7.1993
26.3942 163.0 1304 7.2154
26.4012 164.0 1312 7.2217
26.174 165.0 1320 7.2349
25.9937 166.0 1328 7.2536
26.0928 167.0 1336 7.2645
25.9871 168.0 1344 7.2783
25.7454 169.0 1352 7.2914
25.6648 170.0 1360 7.2922
25.5124 171.0 1368 7.3141
25.3788 172.0 1376 7.3069
25.465 173.0 1384 7.3387
25.2841 174.0 1392 7.3320
25.0767 175.0 1400 7.3440
25.1146 176.0 1408 7.3695
25.0539 177.0 1416 7.3708
24.9689 178.0 1424 7.3734
24.601 179.0 1432 7.4054
24.6286 180.0 1440 7.4016
24.5223 181.0 1448 7.4134
24.5607 182.0 1456 7.4229
24.4185 183.0 1464 7.4404
24.1607 184.0 1472 7.4465
24.1936 185.0 1480 7.4559
24.0903 186.0 1488 7.4554
24.1294 187.0 1496 7.4836
24.0333 188.0 1504 7.4779
23.9775 189.0 1512 7.4905
23.734 190.0 1520 7.5246
23.665 191.0 1528 7.5111
23.5242 192.0 1536 7.5182
23.4995 193.0 1544 7.5242
23.3869 194.0 1552 7.5368
23.3194 195.0 1560 7.5561
23.2282 196.0 1568 7.5571
23.1417 197.0 1576 7.5627
23.1608 198.0 1584 7.5738
22.9937 199.0 1592 7.5883
22.9023 200.0 1600 7.5910
22.8301 201.0 1608 7.6062
22.839 202.0 1616 7.6199
22.6699 203.0 1624 7.6279
22.6976 204.0 1632 7.6251
22.4869 205.0 1640 7.6344
22.5602 206.0 1648 7.6426
22.3682 207.0 1656 7.6510
22.3643 208.0 1664 7.6597
22.4216 209.0 1672 7.6692
22.295 210.0 1680 7.6737
22.1837 211.0 1688 7.6762
21.9896 212.0 1696 7.7008
22.0444 213.0 1704 7.6955
22.0932 214.0 1712 7.7042
22.0176 215.0 1720 7.7077
21.7476 216.0 1728 7.7200
21.7287 217.0 1736 7.7299
21.7611 218.0 1744 7.7431
21.6926 219.0 1752 7.7508
21.6303 220.0 1760 7.7481
21.6038 221.0 1768 7.7627
21.4299 222.0 1776 7.7623
21.3599 223.0 1784 7.7765
21.4066 224.0 1792 7.7826
21.1809 225.0 1800 7.7892
21.2748 226.0 1808 7.8009
21.3174 227.0 1816 7.8090
21.0702 228.0 1824 7.8045
21.0049 229.0 1832 7.8039
20.9373 230.0 1840 7.8277
21.0195 231.0 1848 7.8261
20.9109 232.0 1856 7.8388
20.8391 233.0 1864 7.8440
20.8554 234.0 1872 7.8535
20.6738 235.0 1880 7.8542
20.6079 236.0 1888 7.8567
20.6093 237.0 1896 7.8668
20.5409 238.0 1904 7.8744
20.4727 239.0 1912 7.8772
20.4992 240.0 1920 7.8811
20.3505 241.0 1928 7.8905
20.4625 242.0 1936 7.8894
20.3406 243.0 1944 7.8973
20.2562 244.0 1952 7.9066
20.1959 245.0 1960 7.9082
20.1324 246.0 1968 7.9125
20.1758 247.0 1976 7.9254
20.1901 248.0 1984 7.9210
20.0953 249.0 1992 7.9278
19.9865 250.0 2000 7.9338
19.9955 251.0 2008 7.9386
20.0445 252.0 2016 7.9394
19.7181 253.0 2024 7.9515
19.8769 254.0 2032 7.9557
19.7927 255.0 2040 7.9631
19.7656 256.0 2048 7.9625
19.729 257.0 2056 7.9690
19.7746 258.0 2064 7.9742
19.7607 259.0 2072 7.9804
19.577 260.0 2080 7.9826
19.5543 261.0 2088 7.9884
19.5187 262.0 2096 7.9923
19.5525 263.0 2104 7.9918
19.4421 264.0 2112 8.0028
19.4744 265.0 2120 7.9992
19.4247 266.0 2128 8.0032
19.3781 267.0 2136 8.0096
19.3096 268.0 2144 8.0175
19.3122 269.0 2152 8.0165
19.2698 270.0 2160 8.0216
19.3156 271.0 2168 8.0266
19.218 272.0 2176 8.0304
19.1812 273.0 2184 8.0270
19.1861 274.0 2192 8.0371
19.2505 275.0 2200 8.0363
19.0715 276.0 2208 8.0451
19.0956 277.0 2216 8.0520
19.0811 278.0 2224 8.0517
18.9746 279.0 2232 8.0554
19.0338 280.0 2240 8.0611
18.9882 281.0 2248 8.0619
18.894 282.0 2256 8.0615
18.8913 283.0 2264 8.0667
18.9493 284.0 2272 8.0657
18.8434 285.0 2280 8.0686
18.8559 286.0 2288 8.0732
18.8983 287.0 2296 8.0761
18.7152 288.0 2304 8.0779
18.7383 289.0 2312 8.0811
18.7166 290.0 2320 8.0861
18.6856 291.0 2328 8.0899
18.7324 292.0 2336 8.0886
18.6808 293.0 2344 8.0957
18.5322 294.0 2352 8.0955
18.6197 295.0 2360 8.0970
18.496 296.0 2368 8.1006
18.6525 297.0 2376 8.1045
18.5264 298.0 2384 8.1058
18.5063 299.0 2392 8.1092
18.5643 300.0 2400 8.1089
18.586 301.0 2408 8.1150
18.4556 302.0 2416 8.1101
18.4819 303.0 2424 8.1163
18.378 304.0 2432 8.1196
18.4967 305.0 2440 8.1191
18.3321 306.0 2448 8.1202
18.4337 307.0 2456 8.1193
18.3281 308.0 2464 8.1272
18.3006 309.0 2472 8.1311
18.2738 310.0 2480 8.1309
18.3395 311.0 2488 8.1294
18.292 312.0 2496 8.1290
18.2963 313.0 2504 8.1356
18.2223 314.0 2512 8.1344
18.1901 315.0 2520 8.1355
18.3431 316.0 2528 8.1415
18.2177 317.0 2536 8.1419
18.2611 318.0 2544 8.1423
18.2588 319.0 2552 8.1423
18.1111 320.0 2560 8.1466
18.2206 321.0 2568 8.1455
18.0979 322.0 2576 8.1510
18.0469 323.0 2584 8.1501
18.2613 324.0 2592 8.1539
18.0968 325.0 2600 8.1486
18.1836 326.0 2608 8.1546
18.1618 327.0 2616 8.1538
18.1316 328.0 2624 8.1556
18.0576 329.0 2632 8.1540
18.0833 330.0 2640 8.1584
18.1602 331.0 2648 8.1636
18.1423 332.0 2656 8.1587
18.0728 333.0 2664 8.1621
18.1527 334.0 2672 8.1604
18.0712 335.0 2680 8.1610
17.9818 336.0 2688 8.1640
18.0128 337.0 2696 8.1609
18.1254 338.0 2704 8.1635
18.078 339.0 2712 8.1650
17.9944 340.0 2720 8.1635
18.0741 341.0 2728 8.1650
18.0014 342.0 2736 8.1663
18.0411 343.0 2744 8.1675
17.983 344.0 2752 8.1654
17.8898 345.0 2760 8.1692
17.9698 346.0 2768 8.1691
17.9761 347.0 2776 8.1677
18.009 348.0 2784 8.1698
17.9711 349.0 2792 8.1693
18.0451 350.0 2800 8.1700
18.0299 351.0 2808 8.1699
18.0121 352.0 2816 8.1702
17.957 353.0 2824 8.1698
17.9758 354.0 2832 8.1699
18.0933 355.0 2840 8.1711
18.0287 356.0 2848 8.1707
17.9982 357.0 2856 8.1711
17.9634 358.0 2864 8.1715
18.0232 359.0 2872 8.1713
17.9229 360.0 2880 8.1713
18.0117 361.0 2888 8.1711
18.0183 362.0 2896 8.1708
17.9465 363.0 2904 8.1713
17.9505 364.0 2912 8.1719
17.9601 365.0 2920 8.1718
17.9163 366.0 2928 8.1719
18.0543 367.0 2936 8.1720
17.9808 368.0 2944 8.1722
17.9907 369.0 2952 8.1722
17.9784 370.0 2960 8.1724
17.9297 371.0 2968 8.1724
17.9071 372.0 2976 8.1725
18.0145 373.0 2984 8.1724
17.9581 374.0 2992 8.1724
18.0314 375.0 3000 8.1724
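The validation loss in the table bottoms out at 5.8782 around epoch 59 (step 472) and then climbs steadily back to 8.1724 by the end of training, a textbook overfitting curve. A small sketch, using a handful of (epoch, validation loss) pairs copied from the table, that locates the minimum:

```python
# A few (epoch, validation_loss) points copied from the table above.
val_loss = {
    1: 10.1437,
    30: 6.1708,
    57: 5.8805,
    59: 5.8782,   # minimum in the full table
    100: 6.2993,
    200: 7.5910,
    300: 8.1089,
    375: 8.1724,  # final (reported) evaluation loss
}

# Epoch with the lowest validation loss among the sampled points.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 59 5.8782
```

Given this curve, a checkpoint selected on validation loss (around epoch 59) would likely serve downstream use better than the final weights.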

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0
Model size: 126M parameters (F32, Safetensors)

This model is part of a collection that includes IParraMartin/impossible-llms-spanish-random.