train_cb_1745950320

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the cb dataset. It achieves the following results on the evaluation set:

  • Loss: 3.5088 (matching the lowest validation loss in the results table below, reached at step 30000)
  • Num Input Tokens Seen: 23078128
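
The framework versions listed below include PEFT, so this checkpoint is presumably a PEFT adapter on top of the base model rather than a full fine-tune. Below is a minimal loading and inference sketch; the adapter id rbelanec/train_cb_1745950320 and the prompt format are assumptions, not the author's exact setup.

```python
# Minimal sketch, not the author's exact inference setup.
# Assumptions: the adapter is published as "rbelanec/train_cb_1745950320" and the
# prompt template below is only illustrative; the real training template is unknown.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_id = "rbelanec/train_cb_1745950320"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(base, adapter_id)
model.eval()

prompt = (
    "premise: The cat sat outside all night.\n"
    "hypothesis: The cat was indoors.\n"
    "Does the premise entail the hypothesis? Answer entailment, contradiction, or neutral.\n"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=8, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```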

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed
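
The card does not document the data, but "cb" in this setting usually refers to the CommitmentBank task (premise/hypothesis pairs labelled entailment, contradiction, or neutral). The sketch below inspects that task with the datasets library; the dataset id and any preprocessing used for training are assumptions.

```python
# Sketch only: the dataset id "super_glue"/"cb" is an assumption; the card does
# not state which copy of CommitmentBank or which preprocessing was used.
from datasets import load_dataset

cb = load_dataset("super_glue", "cb")
print(cb)              # expected splits: train / validation / test
print(cb["train"][0])  # fields: premise, hypothesis, idx, label
```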

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
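
For reference, here is a rough sketch of how the list above maps onto Hugging Face TrainingArguments. The output directory and the 200-step evaluation/logging cadence (inferred from the results table below) are assumptions rather than the author's exact script.

```python
# Reconstruction of the reported hyperparameters; not the original training script.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="train_cb_1745950320",  # assumed
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    seed=123,
    gradient_accumulation_steps=2,     # effective train batch size: 4
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    max_steps=40000,
    eval_strategy="steps",             # eval loss is logged every 200 steps below
    eval_steps=200,
    logging_steps=200,
)
```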

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
3.425 3.5133 200 3.6337 116248
3.8051 7.0177 400 3.5796 232144
3.4907 10.5310 600 3.5604 346496
3.5124 14.0354 800 3.5374 462696
3.3202 17.5487 1000 3.5806 578728
3.0515 21.0531 1200 3.5580 692976
3.0487 24.5664 1400 3.5418 809080
3.3479 28.0708 1600 3.5445 924048
2.729 31.5841 1800 3.5421 1040096
2.9491 35.0885 2000 3.5277 1155784
3.4026 38.6018 2200 3.5560 1271880
2.8654 42.1062 2400 3.5492 1386392
3.47 45.6195 2600 3.5358 1502448
2.8545 49.1239 2800 3.5362 1616928
3.1286 52.6372 3000 3.5301 1732240
2.9785 56.1416 3200 3.5395 1847880
2.9734 59.6549 3400 3.5483 1963376
2.5448 63.1593 3600 3.5507 2078344
2.9699 66.6726 3800 3.5546 2193696
3.1915 70.1770 4000 3.5416 2309024
3.4015 73.6903 4200 3.5653 2425544
3.5847 77.1947 4400 3.5412 2539944
3.0721 80.7080 4600 3.5401 2655720
3.0372 84.2124 4800 3.5393 2771904
3.1256 87.7257 5000 3.5283 2887856
2.7119 91.2301 5200 3.5394 3003888
3.3761 94.7434 5400 3.5427 3118800
3.3381 98.2478 5600 3.5265 3234376
3.1174 101.7611 5800 3.5168 3350608
2.7983 105.2655 6000 3.5389 3466256
2.9856 108.7788 6200 3.5365 3582008
2.9597 112.2832 6400 3.5427 3696904
3.0017 115.7965 6600 3.5417 3812728
3.696 119.3009 6800 3.5550 3927256
2.7994 122.8142 7000 3.5606 4043128
2.8761 126.3186 7200 3.5367 4158920
3.3338 129.8319 7400 3.5252 4274536
3.5999 133.3363 7600 3.5412 4389864
3.6932 136.8496 7800 3.5351 4505192
2.9592 140.3540 8000 3.5539 4620656
3.2642 143.8673 8200 3.5257 4736960
2.9824 147.3717 8400 3.5323 4850688
2.9507 150.8850 8600 3.5333 4965800
3.4216 154.3894 8800 3.5413 5082848
2.8698 157.9027 9000 3.5578 5197896
3.4896 161.4071 9200 3.5551 5312976
2.8145 164.9204 9400 3.5369 5428816
3.4116 168.4248 9600 3.5577 5542632
3.6992 171.9381 9800 3.5283 5660064
2.8501 175.4425 10000 3.5443 5775432
3.0806 178.9558 10200 3.5297 5891480
3.1463 182.4602 10400 3.5459 6006016
3.5075 185.9735 10600 3.5435 6121200
2.822 189.4779 10800 3.5489 6236696
2.7503 192.9912 11000 3.5220 6352152
2.9338 196.4956 11200 3.5553 6467792
3.6244 200.0 11400 3.5236 6581880
2.7449 203.5133 11600 3.5427 6697328
3.0423 207.0177 11800 3.5409 6811792
3.0096 210.5310 12000 3.5342 6928248
2.7702 214.0354 12200 3.5190 7043832
3.1358 217.5487 12400 3.5237 7157984
2.9389 221.0531 12600 3.5543 7274032
3.4344 224.5664 12800 3.5662 7390136
3.5665 228.0708 13000 3.5424 7505120
3.8253 231.5841 13200 3.5209 7619616
2.9442 235.0885 13400 3.5688 7736064
3.319 238.6018 13600 3.5466 7850792
3.1764 242.1062 13800 3.5327 7965808
3.5727 245.6195 14000 3.5448 8081552
3.3459 249.1239 14200 3.5463 8197208
3.5196 252.6372 14400 3.5445 8312272
3.4635 256.1416 14600 3.5668 8426888
3.2052 259.6549 14800 3.5377 8542448
2.5383 263.1593 15000 3.5260 8658448
2.9495 266.6726 15200 3.5391 8773608
2.7139 270.1770 15400 3.5480 8887928
3.3391 273.6903 15600 3.5418 9004600
2.8801 277.1947 15800 3.5455 9119624
3.5344 280.7080 16000 3.5489 9233904
3.7411 284.2124 16200 3.5558 9351032
2.7646 287.7257 16400 3.5511 9465944
2.8915 291.2301 16600 3.5570 9581568
3.1433 294.7434 16800 3.5143 9696576
2.7825 298.2478 17000 3.5317 9811496
3.1644 301.7611 17200 3.5339 9926600
3.113 305.2655 17400 3.5569 10042072
3.0783 308.7788 17600 3.5383 10156616
2.9394 312.2832 17800 3.5523 10272688
3.2511 315.7965 18000 3.5404 10386824
3.3896 319.3009 18200 3.5513 10502040
3.4877 322.8142 18400 3.5534 10617608
3.0427 326.3186 18600 3.5439 10731768
2.7773 329.8319 18800 3.5344 10848480
3.0179 333.3363 19000 3.5350 10963328
3.0879 336.8496 19200 3.5526 11078712
3.0143 340.3540 19400 3.5403 11193832
3.287 343.8673 19600 3.5532 11309368
3.5063 347.3717 19800 3.5290 11424912
2.8885 350.8850 20000 3.5471 11539864
2.9811 354.3894 20200 3.5397 11654632
3.3059 357.9027 20400 3.5575 11771008
2.9681 361.4071 20600 3.5233 11886608
3.9619 364.9204 20800 3.5355 12002608
2.9812 368.4248 21000 3.5577 12117448
3.4788 371.9381 21200 3.5370 12233152
3.3682 375.4425 21400 3.5500 12346784
3.231 378.9558 21600 3.5294 12463336
2.8366 382.4602 21800 3.5480 12578616
3.828 385.9735 22000 3.5207 12693160
3.3448 389.4779 22200 3.5252 12808696
2.7932 392.9912 22400 3.5520 12924056
2.5436 396.4956 22600 3.5661 13039656
3.1512 400.0 22800 3.5414 13154552
2.9926 403.5133 23000 3.5694 13269320
2.7649 407.0177 23200 3.5436 13385512
3.6216 410.5310 23400 3.5395 13501208
3.2219 414.0354 23600 3.5469 13617048
3.1651 417.5487 23800 3.5769 13733448
3.1999 421.0531 24000 3.5360 13848288
2.8829 424.5664 24200 3.5558 13963536
3.6948 428.0708 24400 3.5464 14080024
3.5368 431.5841 24600 3.5273 14194520
3.735 435.0885 24800 3.5479 14310080
2.6585 438.6018 25000 3.5429 14427448
3.5032 442.1062 25200 3.5520 14542448
3.2256 445.6195 25400 3.5430 14657640
3.2064 449.1239 25600 3.5425 14772328
2.9344 452.6372 25800 3.5361 14888712
2.7779 456.1416 26000 3.5385 15002944
3.4417 459.6549 26200 3.5638 15118544
3.1454 463.1593 26400 3.5473 15234184
3.2606 466.6726 26600 3.5355 15349544
3.2821 470.1770 26800 3.5506 15465448
3.724 473.6903 27000 3.5527 15581752
2.6703 477.1947 27200 3.5448 15696720
3.0141 480.7080 27400 3.5629 15812864
3.1294 484.2124 27600 3.5562 15928512
3.0489 487.7257 27800 3.5376 16043264
3.1688 491.2301 28000 3.5395 16158992
3.4673 494.7434 28200 3.5358 16274040
3.0875 498.2478 28400 3.5622 16389944
2.8567 501.7611 28600 3.5384 16506208
2.9697 505.2655 28800 3.5358 16621272
3.4483 508.7788 29000 3.5660 16737072
3.6537 512.2832 29200 3.5506 16852312
3.475 515.7965 29400 3.5415 16967744
3.4424 519.3009 29600 3.5482 17083368
3.2134 522.8142 29800 3.5486 17197984
3.1199 526.3186 30000 3.5088 17314032
3.192 529.8319 30200 3.5586 17428904
2.6813 533.3363 30400 3.5461 17543048
2.6821 536.8496 30600 3.5537 17659880
3.6959 540.3540 30800 3.5573 17773728
3.4702 543.8673 31000 3.5528 17889344
3.2742 547.3717 31200 3.5559 18005392
2.6368 550.8850 31400 3.5525 18120296
3.3255 554.3894 31600 3.5570 18235552
2.916 557.9027 31800 3.5544 18352024
3.5079 561.4071 32000 3.5512 18466080
3.4685 564.9204 32200 3.5352 18581584
3.6533 568.4248 32400 3.5441 18697408
3.2041 571.9381 32600 3.5492 18811608
3.1245 575.4425 32800 3.5539 18927640
3.1741 578.9558 33000 3.5364 19043672
3.2603 582.4602 33200 3.5436 19157776
3.4538 585.9735 33400 3.5418 19272744
3.3276 589.4779 33600 3.5411 19388520
3.1365 592.9912 33800 3.5524 19504472
3.3086 596.4956 34000 3.5609 19618408
3.4689 600.0 34200 3.5438 19734128
3.2426 603.5133 34400 3.5543 19849608
3.1304 607.0177 34600 3.5547 19964704
2.7656 610.5310 34800 3.5442 20080968
3.1542 614.0354 35000 3.5390 20195624
2.5122 617.5487 35200 3.5408 20311640
2.6218 621.0531 35400 3.5379 20426832
3.5177 624.5664 35600 3.5434 20541816
3.4989 628.0708 35800 3.5423 20656416
2.4929 631.5841 36000 3.5423 20771136
3.5688 635.0885 36200 3.5374 20886272
3.0984 638.6018 36400 3.5423 21001560
3.573 642.1062 36600 3.5374 21115320
2.8932 645.6195 36800 3.5374 21230216
3.0303 649.1239 37000 3.5374 21344656
2.6008 652.6372 37200 3.5374 21461664
3.19 656.1416 37400 3.5374 21576216
3.3065 659.6549 37600 3.5374 21692088
3.3077 663.1593 37800 3.5374 21807184
3.3042 666.6726 38000 3.5374 21923192
2.9595 670.1770 38200 3.5374 22037928
2.9425 673.6903 38400 3.5374 22153968
3.1744 677.1947 38600 3.5374 22269648
3.4269 680.7080 38800 3.5374 22385640
3.6816 684.2124 39000 3.5374 22502040
3.2897 687.7257 39200 3.5374 22616408
2.8114 691.2301 39400 3.5374 22732496
2.8056 694.7434 39600 3.5374 22846704
3.1919 698.2478 39800 3.5374 22962016
3.0684 701.7611 40000 3.5374 23078128
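
If the checkpoint directory is available locally, the validation-loss curve in the table above can be re-plotted from the Trainer's trainer_state.json; the path below is an assumption.

```python
# Sketch: plot the logged validation losses from a local checkpoint directory.
import json
import matplotlib.pyplot as plt

with open("train_cb_1745950320/trainer_state.json") as f:  # path assumed
    state = json.load(f)

evals = [e for e in state["log_history"] if "eval_loss" in e]
plt.plot([e["step"] for e in evals], [e["eval_loss"] for e in evals])
plt.xlabel("step")
plt.ylabel("validation loss")
plt.show()
```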

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1