train_boolq_1745950282

This model is a fine-tuned version of mistralai/Mistral-7B-Instruct-v0.3 on the boolq dataset. It achieves the following results on the evaluation set:

  • Loss: 1.5235
  • Num Input Tokens Seen: 37097424
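
Since the reported loss is the mean token-level cross-entropy (assuming the standard Trainer objective), it corresponds to a validation perplexity of exp(1.5235) ≈ 4.59:

```python
import math

# Perplexity from mean token cross-entropy; assumes the reported loss is
# the standard per-token cross-entropy computed by the HF Trainer.
val_loss = 1.5235
print(math.exp(val_loss))  # ≈ 4.59
```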

Model description

This repository contains a parameter-efficient (PEFT) adapter trained on top of mistralai/Mistral-7B-Instruct-v0.3 on the boolq dataset; the base model weights are not modified or included here.

Intended uses & limitations

The adapter is intended for boolean (yes/no) question answering in the style of BoolQ. No downstream accuracy or robustness evaluation is documented beyond the validation loss reported above, so behavior on other tasks is unknown. A minimal loading sketch follows.
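
The sketch below is untested and assumes this repository holds a standard PEFT causal-LM adapter (which the framework versions listed later suggest); the prompt format is hypothetical, since the training prompt template is not documented in this card:

```python
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM

# Loads the base model named in the adapter config and applies the adapter.
model = AutoPeftModelForCausalLM.from_pretrained("rbelanec/train_boolq_1745950282")
tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

# Hypothetical BoolQ-style prompt; the exact training format is not documented.
prompt = (
    "Passage: The sky appears blue due to Rayleigh scattering.\n"
    "Question: is the sky blue?\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=5)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```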

Training and evaluation data

The model was fine-tuned and evaluated on the boolq dataset (BoolQ), a collection of naturally occurring yes/no questions paired with supporting Wikipedia passages. The exact splits and preprocessing used for this run are not documented.
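
For reference, the dataset can be loaded from the Hugging Face Hub (a sketch assuming the public google/boolq copy):

```python
from datasets import load_dataset

# BoolQ: yes/no questions paired with Wikipedia passages.
ds = load_dataset("google/boolq")
print(ds)                     # train / validation splits
print(ds["train"][0].keys())  # 'question', 'answer', 'passage'
```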

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
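
Expressed as code, these settings correspond roughly to the following TrainingArguments (a sketch; output_dir and anything not listed above are placeholder assumptions):

```python
from transformers import TrainingArguments

# Reconstruction of the listed hyperparameters; values not in the card
# (e.g. output_dir) are placeholders.
args = TrainingArguments(
    output_dir="train_boolq_1745950282",
    learning_rate=5e-05,
    per_device_train_batch_size=2,
    per_device_eval_batch_size=2,
    gradient_accumulation_steps=2,  # total train batch size: 2 * 2 = 4
    seed=123,
    optim="adamw_torch",            # betas=(0.9, 0.999), epsilon=1e-08
    lr_scheduler_type="cosine",
    max_steps=40000,
)
```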

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
0.9126 0.0943 200 1.6075 186768
1.8665 0.1886 400 1.5869 369808
2.3282 0.2829 600 1.5783 554928
1.2111 0.3772 800 1.5787 746560
0.6563 0.4715 1000 1.5645 932848
1.6293 0.5658 1200 1.5633 1116128
1.9335 0.6601 1400 1.5533 1299664
2.187 0.7544 1600 1.5488 1481856
1.1897 0.8487 1800 1.5607 1672160
1.521 0.9430 2000 1.5519 1860608
1.5792 1.0372 2200 1.5547 2047984
1.9829 1.1315 2400 1.5497 2230960
2.2104 1.2258 2600 1.5496 2417664
1.5622 1.3201 2800 1.5517 2600368
0.5567 1.4144 3000 1.5504 2786848
1.7935 1.5087 3200 1.5425 2972672
0.828 1.6030 3400 1.5452 3154640
1.7858 1.6973 3600 1.5419 3339328
1.5979 1.7916 3800 1.5441 3522384
1.5046 1.8859 4000 1.5455 3712352
0.8123 1.9802 4200 1.5366 3899328
1.9172 2.0745 4400 1.5408 4085888
1.7031 2.1688 4600 1.5413 4271936
1.2118 2.2631 4800 1.5362 4456320
1.3216 2.3574 5000 1.5418 4638512
2.1133 2.4517 5200 1.5425 4830688
1.5954 2.5460 5400 1.5369 5016480
1.5956 2.6403 5600 1.5407 5204048
1.2136 2.7346 5800 1.5380 5383984
2.0436 2.8289 6000 1.5334 5574016
2.1082 2.9231 6200 1.5342 5761616
1.6868 3.0174 6400 1.5428 5948128
1.6432 3.1117 6600 1.5433 6134304
1.2227 3.2060 6800 1.5346 6319616
2.4769 3.3003 7000 1.5341 6505744
0.7089 3.3946 7200 1.5427 6692208
0.931 3.4889 7400 1.5286 6875616
1.6422 3.5832 7600 1.5338 7059472
2.4477 3.6775 7800 1.5321 7243472
2.3196 3.7718 8000 1.5354 7428048
0.9875 3.8661 8200 1.5305 7611184
0.8705 3.9604 8400 1.5299 7796112
1.726 4.0547 8600 1.5377 7979520
1.05 4.1490 8800 1.5366 8167776
0.8927 4.2433 9000 1.5298 8355856
1.4016 4.3376 9200 1.5397 8543120
1.2307 4.4319 9400 1.5363 8727088
1.3127 4.5262 9600 1.5346 8914992
1.3066 4.6205 9800 1.5394 9095040
1.7302 4.7148 10000 1.5311 9283072
1.1526 4.8091 10200 1.5369 9467600
1.8063 4.9033 10400 1.5314 9653456
2.1182 4.9976 10600 1.5371 9841232
0.6087 5.0919 10800 1.5387 10025504
0.741 5.1862 11000 1.5376 10216464
3.1286 5.2805 11200 1.5334 10402448
1.0825 5.3748 11400 1.5321 10586976
0.864 5.4691 11600 1.5356 10770896
1.7066 5.5634 11800 1.5378 10959424
1.2288 5.6577 12000 1.5421 11146816
1.8905 5.7520 12200 1.5380 11328528
1.1623 5.8463 12400 1.5419 11515600
1.6314 5.9406 12600 1.5384 11697056
0.4464 6.0349 12800 1.5378 11884336
1.1535 6.1292 13000 1.5300 12074128
0.2332 6.2235 13200 1.5386 12258064
1.6191 6.3178 13400 1.5347 12443248
0.9751 6.4121 13600 1.5278 12626480
1.8843 6.5064 13800 1.5368 12813808
1.5336 6.6007 14000 1.5303 12998256
0.8781 6.6950 14200 1.5382 13180928
1.9583 6.7893 14400 1.5305 13364368
2.0035 6.8835 14600 1.5235 13552272
1.9377 6.9778 14800 1.5285 13735904
2.7577 7.0721 15000 1.5361 13924000
2.4858 7.1664 15200 1.5375 14113184
1.8379 7.2607 15400 1.5328 14295568
0.5395 7.3550 15600 1.5352 14480560
1.4666 7.4493 15800 1.5322 14664736
2.6556 7.5436 16000 1.5277 14852128
0.8046 7.6379 16200 1.5363 15033840
1.0626 7.7322 16400 1.5361 15219136
1.4575 7.8265 16600 1.5316 15404160
2.1096 7.9208 16800 1.5287 15589632
1.2183 8.0151 17000 1.5347 15781760
0.7056 8.1094 17200 1.5361 15967648
1.0756 8.2037 17400 1.5345 16155248
1.6033 8.2980 17600 1.5316 16343648
1.2349 8.3923 17800 1.5410 16523360
1.3093 8.4866 18000 1.5324 16709008
1.3615 8.5809 18200 1.5336 16893648
0.632 8.6752 18400 1.5273 17079824
1.8603 8.7694 18600 1.5302 17265072
2.1083 8.8637 18800 1.5434 17445904
1.4074 8.9580 19000 1.5448 17631504
0.9535 9.0523 19200 1.5377 17818512
0.6912 9.1466 19400 1.5420 18005200
1.7746 9.2409 19600 1.5344 18190416
2.0576 9.3352 19800 1.5400 18373200
2.5187 9.4295 20000 1.5314 18556672
2.6635 9.5238 20200 1.5391 18742816
2.1528 9.6181 20400 1.5346 18930224
1.7716 9.7124 20600 1.5400 19115456
1.355 9.8067 20800 1.5383 19296016
0.5675 9.9010 21000 1.5383 19482416
0.974 9.9953 21200 1.5275 19668640
1.1068 10.0896 21400 1.5334 19860880
1.0851 10.1839 21600 1.5333 20052672
0.9046 10.2782 21800 1.5318 20236224
1.5511 10.3725 22000 1.5293 20421632
2.4675 10.4668 22200 1.5286 20608320
1.1624 10.5611 22400 1.5350 20788112
1.0901 10.6554 22600 1.5330 20969744
0.9 10.7496 22800 1.5276 21151648
2.3174 10.8439 23000 1.5259 21335600
1.3619 10.9382 23200 1.5372 21522352
2.7212 11.0325 23400 1.5358 21709568
1.3666 11.1268 23600 1.5326 21894592
1.3113 11.2211 23800 1.5271 22079344
1.0245 11.3154 24000 1.5336 22269152
2.2966 11.4097 24200 1.5339 22451760
1.6225 11.5040 24400 1.5459 22639312
0.5084 11.5983 24600 1.5342 22821728
0.9625 11.6926 24800 1.5335 23005696
1.3978 11.7869 25000 1.5367 23192112
1.7703 11.8812 25200 1.5308 23373840
1.3747 11.9755 25400 1.5273 23559968
1.1913 12.0698 25600 1.5376 23743680
2.9794 12.1641 25800 1.5405 23931472
0.6398 12.2584 26000 1.5336 24118800
1.9674 12.3527 26200 1.5306 24308976
0.641 12.4470 26400 1.5383 24493584
0.9799 12.5413 26600 1.5285 24679264
1.8837 12.6355 26800 1.5345 24861136
1.7178 12.7298 27000 1.5277 25046496
2.0154 12.8241 27200 1.5377 25230592
2.2121 12.9184 27400 1.5383 25411904
1.9462 13.0127 27600 1.5341 25595280
1.892 13.1070 27800 1.5425 25777696
0.7727 13.2013 28000 1.5292 25963552
0.6919 13.2956 28200 1.5313 26150464
2.4513 13.3899 28400 1.5388 26335552
1.9732 13.4842 28600 1.5330 26524096
2.0418 13.5785 28800 1.5242 26713392
0.6923 13.6728 29000 1.5337 26900464
1.8188 13.7671 29200 1.5407 27087040
1.383 13.8614 29400 1.5411 27270960
2.4062 13.9557 29600 1.5326 27457936
2.2125 14.0500 29800 1.5275 27639216
1.6523 14.1443 30000 1.5383 27829056
1.2126 14.2386 30200 1.5311 28019840
0.7117 14.3329 30400 1.5315 28205616
2.046 14.4272 30600 1.5307 28390464
0.8394 14.5215 30800 1.5439 28571424
0.5962 14.6157 31000 1.5324 28758128
1.3903 14.7100 31200 1.5316 28942096
1.9185 14.8043 31400 1.5297 29127440
0.3431 14.8986 31600 1.5316 29310016
1.5252 14.9929 31800 1.5402 29497520
2.2414 15.0872 32000 1.5419 29680160
1.1793 15.1815 32200 1.5367 29872080
2.1016 15.2758 32400 1.5358 30060048
2.7248 15.3701 32600 1.5426 30243024
1.5292 15.4644 32800 1.5275 30433968
0.88 15.5587 33000 1.5373 30617936
0.3452 15.6530 33200 1.5300 30802960
0.7676 15.7473 33400 1.5313 30985296
0.8678 15.8416 33600 1.5355 31168496
1.3862 15.9359 33800 1.5326 31350688
1.3233 16.0302 34000 1.5336 31530704
1.5892 16.1245 34200 1.5307 31718960
0.6211 16.2188 34400 1.5267 31901696
1.4859 16.3131 34600 1.5274 32092528
2.7674 16.4074 34800 1.5282 32279920
2.014 16.5017 35000 1.5344 32461952
0.8514 16.5959 35200 1.5360 32647696
1.6113 16.6902 35400 1.5324 32828656
0.26 16.7845 35600 1.5334 33016320
1.6374 16.8788 35800 1.5295 33202224
1.3047 16.9731 36000 1.5368 33385424
2.1828 17.0674 36200 1.5384 33572672
1.343 17.1617 36400 1.5347 33759120
1.7243 17.2560 36600 1.5392 33946224
1.2776 17.3503 36800 1.5344 34137504
2.3692 17.4446 37000 1.5323 34322448
1.059 17.5389 37200 1.5340 34506880
2.9969 17.6332 37400 1.5392 34692032
1.4363 17.7275 37600 1.5348 34873984
2.3506 17.8218 37800 1.5300 35058576
0.3766 17.9161 38000 1.5268 35245152
1.7634 18.0104 38200 1.5272 35431232
1.4477 18.1047 38400 1.5272 35615248
1.3816 18.1990 38600 1.5272 35798688
2.7273 18.2933 38800 1.5272 35984224
1.4794 18.3876 39000 1.5272 36168064
0.3987 18.4818 39200 1.5272 36351216
0.853 18.5761 39400 1.5272 36537456
0.8147 18.6704 39600 1.5272 36723376
0.4624 18.7647 39800 1.5272 36910256
0.7265 18.8590 40000 1.5272 37097424

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1