impossible-llms-french-natural

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 5.2486
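
The evaluation loss was computed with label smoothing (factor 0.1, see the hyperparameters below), so exponentiating it gives only a rough estimate of perplexity, not an exact value. A minimal sketch of that conversion:

```python
import math

# Rough perplexity estimate from the reported eval loss. Because the loss
# includes label smoothing (0.1), exp(loss) overstates the true perplexity.
eval_loss = 5.2486
print(f"approximate perplexity: {math.exp(eval_loss):.1f}")  # ~190.3
```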

Model description

More information needed

Intended uses & limitations

More information needed
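
No usage guidance is provided in this card. Below is a minimal loading sketch, assuming the checkpoint is a standard causal language model compatible with the transformers Auto classes (the architecture is not documented here); the French prompt is only an illustration.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumption: the checkpoint is a causal LM loadable via the Auto classes.
model_id = "IParraMartin/impossible-llms-french-natural"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Generate a short continuation for a French prompt ("The cat sleeps on the").
inputs = tokenizer("Le chat dort sur le", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```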

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows this list):

  • learning_rate: 0.0001
  • train_batch_size: 12
  • eval_batch_size: 8
  • seed: 0
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 384
  • total_eval_batch_size: 32
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • training_steps: 3000
  • mixed_precision_training: Native AMP
  • label_smoothing_factor: 0.1
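
For reference, the values above map onto a Hugging Face TrainingArguments configuration roughly as follows. This is a hypothetical reconstruction (the original training script is not part of this card), and the output directory name is only a placeholder.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="impossible-llms-french-natural",  # placeholder path
    learning_rate=1e-4,
    per_device_train_batch_size=12,
    per_device_eval_batch_size=8,
    seed=0,
    gradient_accumulation_steps=8,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    max_steps=3000,
    fp16=True,  # "Native AMP" mixed precision
    label_smoothing_factor=0.1,
)
# With 4 GPUs, the effective batch sizes match the card:
#   train: 12 per device * 4 devices * 8 accumulation steps = 384
#   eval:   8 per device * 4 devices                         = 32
```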

Training results

Training Loss Epoch Step Validation Loss
79.324 0.9412 12 9.7058
72.3794 1.9412 24 8.9807
68.2526 2.9412 36 8.4343
64.0462 3.9412 48 7.9741
60.993 4.9412 60 7.5369
57.7511 5.9412 72 7.1671
55.3603 6.9412 84 6.8273
52.3036 7.9412 96 6.4981
49.539 8.9412 108 6.1508
47.1556 9.9412 120 5.8419
45.0284 10.9412 132 5.5828
43.2922 11.9412 144 5.4049
42.1005 12.9412 156 5.2785
41.5139 13.9412 168 5.1855
41.1069 14.9412 180 5.1035
40.3908 15.9412 192 5.0403
40.0345 16.9412 204 4.9890
39.7093 17.9412 216 4.9451
39.2532 18.9412 228 4.9119
38.8114 19.9412 240 4.8673
38.4829 20.9412 252 4.8454
38.5334 21.9412 264 4.8146
38.2696 22.9412 276 4.7837
37.6478 23.9412 288 4.7537
37.5833 24.9412 300 4.7271
37.4048 25.9412 312 4.6904
37.346 26.9412 324 4.6630
36.897 27.9412 336 4.6363
36.5625 28.9412 348 4.6071
36.5011 29.9412 360 4.5766
35.8165 30.9412 372 4.5556
35.3533 31.9412 384 4.5295
35.8698 32.9412 396 4.5011
35.277 33.9412 408 4.4804
35.0979 34.9412 420 4.4554
34.7479 35.9412 432 4.4292
34.7307 36.9412 444 4.4080
34.4274 37.9412 456 4.3857
34.3423 38.9412 468 4.3634
33.8011 39.9412 480 4.3502
33.6651 40.9412 492 4.3344
33.209 41.9412 504 4.3097
33.3687 42.9412 516 4.2959
33.036 43.9412 528 4.2852
33.4063 44.9412 540 4.2670
33.1232 45.9412 552 4.2543
32.2983 46.9412 564 4.2390
32.7942 47.9412 576 4.2320
32.1223 48.9412 588 4.2141
32.3692 49.9412 600 4.2068
32.0662 50.9412 612 4.1979
31.8694 51.9412 624 4.1905
31.8273 52.9412 636 4.1836
31.2749 53.9412 648 4.1809
30.9584 54.9412 660 4.1706
30.9139 55.9412 672 4.1651
30.9499 56.9412 684 4.1591
30.8249 57.9412 696 4.1582
30.9261 58.9412 708 4.1576
30.4192 59.9412 720 4.1529
30.4116 60.9412 732 4.1543
30.1574 61.9412 744 4.1553
30.2333 62.9412 756 4.1559
30.1538 63.9412 768 4.1501
30.0845 64.9412 780 4.1590
29.7537 65.9412 792 4.1525
29.525 66.9412 804 4.1600
29.3879 67.9412 816 4.1600
29.3478 68.9412 828 4.1655
29.1908 69.9412 840 4.1659
29.1259 70.9412 852 4.1710
28.809 71.9412 864 4.1776
28.777 72.9412 876 4.1767
28.8071 73.9412 888 4.1846
28.51 74.9412 900 4.1886
28.4728 75.9412 912 4.1951
28.2195 76.9412 924 4.2101
27.6937 77.9412 936 4.2139
27.9026 78.9412 948 4.2165
27.9768 79.9412 960 4.2249
27.891 80.9412 972 4.2337
27.6059 81.9412 984 4.2375
27.1082 82.9412 996 4.2553
27.262 83.9412 1008 4.2589
27.0908 84.9412 1020 4.2688
27.1283 85.9412 1032 4.2722
26.9426 86.9412 1044 4.2844
26.7563 87.9412 1056 4.2934
26.4637 88.9412 1068 4.3053
26.5164 89.9412 1080 4.3172
26.5367 90.9412 1092 4.3203
26.2075 91.9412 1104 4.3315
26.0509 92.9412 1116 4.3422
26.1038 93.9412 1128 4.3512
25.8243 94.9412 1140 4.3656
25.6604 95.9412 1152 4.3696
25.8441 96.9412 1164 4.3794
25.6414 97.9412 1176 4.3949
25.4818 98.9412 1188 4.4061
25.0597 99.9412 1200 4.4115
25.2807 100.9412 1212 4.4259
24.908 101.9412 1224 4.4360
25.0106 102.9412 1236 4.4507
24.6303 103.9412 1248 4.4523
24.7361 104.9412 1260 4.4680
24.5017 105.9412 1272 4.4780
24.4685 106.9412 1284 4.4909
24.3184 107.9412 1296 4.5046
24.3709 108.9412 1308 4.5098
24.1898 109.9412 1320 4.5212
24.0466 110.9412 1332 4.5402
23.9473 111.9412 1344 4.5464
23.8739 112.9412 1356 4.5515
23.777 113.9412 1368 4.5649
23.8447 114.9412 1380 4.5771
23.5075 115.9412 1392 4.5869
23.4193 116.9412 1404 4.5980
23.3085 117.9412 1416 4.6058
23.1388 118.9412 1428 4.6164
23.2893 119.9412 1440 4.6364
23.2012 120.9412 1452 4.6416
23.0239 121.9412 1464 4.6529
22.8655 122.9412 1476 4.6590
22.8033 123.9412 1488 4.6679
22.6793 124.9412 1500 4.6771
22.5078 125.9412 1512 4.6917
22.4668 126.9412 1524 4.7008
22.4739 127.9412 1536 4.7123
22.4229 128.9412 1548 4.7207
22.2832 129.9412 1560 4.7328
22.1788 130.9412 1572 4.7404
22.0455 131.9412 1584 4.7579
22.0092 132.9412 1596 4.7611
22.0439 133.9412 1608 4.7722
21.6403 134.9412 1620 4.7788
21.8553 135.9412 1632 4.7883
21.8224 136.9412 1644 4.7972
21.5326 137.9412 1656 4.8053
21.6021 138.9412 1668 4.8190
21.4662 139.9412 1680 4.8186
21.3165 140.9412 1692 4.8309
21.3277 141.9412 1704 4.8419
21.1161 142.9412 1716 4.8485
21.2015 143.9412 1728 4.8605
21.1416 144.9412 1740 4.8709
21.0827 145.9412 1752 4.8813
20.9501 146.9412 1764 4.8844
20.798 147.9412 1776 4.8962
20.7905 148.9412 1788 4.8984
20.7474 149.9412 1800 4.9068
20.8342 150.9412 1812 4.9165
20.6189 151.9412 1824 4.9224
20.5318 152.9412 1836 4.9337
20.5672 153.9412 1848 4.9412
20.4254 154.9412 1860 4.9431
20.4373 155.9412 1872 4.9519
20.4269 156.9412 1884 4.9637
20.2548 157.9412 1896 4.9697
20.2438 158.9412 1908 4.9783
20.2953 159.9412 1920 4.9849
20.0807 160.9412 1932 4.9913
20.1011 161.9412 1944 5.0040
20.1606 162.9412 1956 5.0104
20.0555 163.9412 1968 5.0061
19.963 164.9412 1980 5.0254
19.9353 165.9412 1992 5.0286
19.8338 166.9412 2004 5.0380
19.6825 167.9412 2016 5.0357
19.8886 168.9412 2028 5.0462
19.6658 169.9412 2040 5.0505
19.7059 170.9412 2052 5.0552
19.6489 171.9412 2064 5.0638
19.6295 172.9412 2076 5.0661
19.5565 173.9412 2088 5.0754
19.4673 174.9412 2100 5.0830
19.3853 175.9412 2112 5.0811
19.3834 176.9412 2124 5.0911
19.3719 177.9412 2136 5.0933
19.3406 178.9412 2148 5.1034
19.1439 179.9412 2160 5.1071
19.2157 180.9412 2172 5.1134
19.1796 181.9412 2184 5.1177
19.2714 182.9412 2196 5.1199
19.1124 183.9412 2208 5.1248
19.1641 184.9412 2220 5.1313
19.0553 185.9412 2232 5.1375
19.1456 186.9412 2244 5.1406
19.0271 187.9412 2256 5.1447
18.981 188.9412 2268 5.1437
18.9321 189.9412 2280 5.1551
18.9404 190.9412 2292 5.1554
18.9149 191.9412 2304 5.1590
18.9277 192.9412 2316 5.1617
18.8459 193.9412 2328 5.1654
18.8936 194.9412 2340 5.1701
18.8323 195.9412 2352 5.1760
18.7228 196.9412 2364 5.1775
18.7283 197.9412 2376 5.1801
18.7426 198.9412 2388 5.1857
18.6973 199.9412 2400 5.1859
18.671 200.9412 2412 5.1868
18.6349 201.9412 2424 5.1922
18.6011 202.9412 2436 5.1962
18.5998 203.9412 2448 5.1966
18.6944 204.9412 2460 5.2007
18.4289 205.9412 2472 5.2047
18.502 206.9412 2484 5.2064
18.5419 207.9412 2496 5.2098
18.6133 208.9412 2508 5.2112
18.5158 209.9412 2520 5.2153
18.4378 210.9412 2532 5.2143
18.4555 211.9412 2544 5.2181
18.4691 212.9412 2556 5.2200
18.4171 213.9412 2568 5.2241
18.3529 214.9412 2580 5.2246
18.4098 215.9412 2592 5.2286
18.3957 216.9412 2604 5.2273
18.3127 217.9412 2616 5.2274
18.3269 218.9412 2628 5.2321
18.3685 219.9412 2640 5.2317
18.375 220.9412 2652 5.2354
18.4309 221.9412 2664 5.2371
18.3184 222.9412 2676 5.2376
18.2647 223.9412 2688 5.2367
18.3628 224.9412 2700 5.2384
18.363 225.9412 2712 5.2392
18.3033 226.9412 2724 5.2406
18.2284 227.9412 2736 5.2416
18.2331 228.9412 2748 5.2420
18.2397 229.9412 2760 5.2425
18.299 230.9412 2772 5.2436
18.2834 231.9412 2784 5.2443
18.0989 232.9412 2796 5.2450
18.1968 233.9412 2808 5.2455
18.2266 234.9412 2820 5.2456
18.1434 235.9412 2832 5.2467
18.2138 236.9412 2844 5.2462
18.2174 237.9412 2856 5.2474
18.308 238.9412 2868 5.2477
18.2234 239.9412 2880 5.2475
18.282 240.9412 2892 5.2485
18.3176 241.9412 2904 5.2482
18.1872 242.9412 2916 5.2482
18.2816 243.9412 2928 5.2482
18.1266 244.9412 2940 5.2484
18.2928 245.9412 2952 5.2485
18.1919 246.9412 2964 5.2485
18.2035 247.9412 2976 5.2486
18.237 248.9412 2988 5.2486
18.2801 249.9412 3000 5.2486
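
Validation loss reaches its minimum of about 4.15 around step 768 (roughly epoch 64) and rises steadily afterwards while training loss keeps falling, so the final 5.2486 reflects a model trained well past its best validation point. If the run were repeated with checkpointing on the best eval loss, the relevant Trainer settings would look roughly like this (a sketch, not part of the original setup):

```python
from transformers import TrainingArguments, EarlyStoppingCallback

# Hypothetical settings for keeping the best checkpoint instead of the last one.
args = TrainingArguments(
    output_dir="impossible-llms-french-natural",  # placeholder path
    eval_strategy="steps",
    eval_steps=12,
    save_strategy="steps",
    save_steps=12,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)
# Passing EarlyStoppingCallback(early_stopping_patience=10) to the Trainer would
# stop training once eval loss has not improved for 10 consecutive evaluations.
```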

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.4.0+cu121
  • Datasets 3.4.0
  • Tokenizers 0.21.0

Model size

  • 126M parameters (F32, stored as Safetensors)