train_boolq_1745950274

This model is a fine-tuned version of google/gemma-3-1b-it on the boolq dataset. It achieves the following results on the evaluation set:

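  • Loss: 2.9271 (the lowest validation loss in the training results table, logged at step 9000; the final logged value at step 40000 is 2.9844)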
  • Num Input Tokens Seen: 34633072
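
The card does not say how to load the model, so the following is only a minimal sketch. It assumes the repository holds a PEFT adapter for google/gemma-3-1b-it (PEFT appears under the framework versions) and that the repository id is rbelanec/train_boolq_1745950274, as shown on the model page. The function name load_adapter_model is hypothetical.

```python
BASE_MODEL_ID = "google/gemma-3-1b-it"          # base model named in this card
ADAPTER_ID = "rbelanec/train_boolq_1745950274"  # repository id as shown on the model page


def load_adapter_model(base_id: str = BASE_MODEL_ID, adapter_id: str = ADAPTER_ID):
    """Load the base model and attach the fine-tuned weights as a PEFT adapter.

    Assumption: the repository stores a PEFT adapter, not full model weights.
    """
    # Imported lazily so this module can be read and imported
    # without the heavy dependencies installed.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

Calling load_adapter_model() downloads both the base model and the adapter, so it needs network access and any access approval the base model requires. The prompt format used during fine-tuning is not documented here, so inputs would have to be formatted by trial or by consulting the training code.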

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 123
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 4
  • optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • training_steps: 40000
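
As a rough check on the values above, the sketch below reproduces total_train_batch_size and evaluates a cosine learning-rate curve over the 40000 steps. Two things are assumptions, not facts from this card: training ran on a single device (which is what 2 × 2 = 4 implies), and there was no warmup and a single half-cosine decay to zero (the card lists no warmup setting).

```python
import math

LEARNING_RATE = 5e-05
TRAIN_BATCH_SIZE = 2
GRAD_ACCUM_STEPS = 2
TRAINING_STEPS = 40_000
NUM_DEVICES = 1   # assumption: implied by total_train_batch_size = 4
WARMUP_STEPS = 0  # assumption: the card lists no warmup setting


def total_train_batch_size() -> int:
    """Examples consumed per optimizer step."""
    return TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def cosine_lr(step: int) -> float:
    """Learning rate at `step` for linear warmup followed by a half-cosine decay."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    progress = (step - WARMUP_STEPS) / max(1, TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size())
    for step in (0, 10_000, 20_000, 30_000, 40_000):
        print(step, f"{cosine_lr(step):.3e}")
```

Under these assumptions the learning rate starts at 5e-05, is half of that at step 20000, and reaches zero at step 40000.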

Training results

Training Loss | Epoch | Step | Validation Loss | Input Tokens Seen
2.6453 0.0943 200 3.2031 174096
3.5809 0.1886 400 3.1503 344560
3.5279 0.2829 600 3.1043 517536
2.8551 0.3772 800 3.0762 696016
2.7868 0.4715 1000 3.0509 868992
2.6766 0.5658 1200 3.0614 1040544
3.84 0.6601 1400 3.0431 1211680
2.3861 0.7544 1600 3.0519 1381792
3.4698 0.8487 1800 3.0226 1559456
2.9566 0.9430 2000 3.0275 1735840
2.8968 1.0372 2200 3.0316 1910848
3.1856 1.1315 2400 3.0049 2081696
3.4134 1.2258 2600 3.0010 2255952
2.9332 1.3201 2800 3.0122 2427152
2.9351 1.4144 3000 3.0129 2601296
2.5434 1.5087 3200 2.9970 2774672
3.1122 1.6030 3400 2.9784 2944896
2.7179 1.6973 3600 2.9600 3117216
2.8625 1.7916 3800 2.9709 3287952
2.7659 1.8859 4000 2.9735 3464640
2.9328 1.9802 4200 2.9566 3638880
3.0105 2.0745 4400 2.9613 3812624
3.4794 2.1688 4600 2.9646 3986544
2.7816 2.2631 4800 2.9783 4158272
3.1668 2.3574 5000 2.9668 4328240
3.0955 2.4517 5200 2.9730 4507760
2.8676 2.5460 5400 2.9506 4681664
3.3803 2.6403 5600 2.9642 4856928
3.1192 2.7346 5800 2.9736 5024976
3.3355 2.8289 6000 2.9883 5202368
3.3114 2.9231 6200 2.9882 5377360
2.1893 3.0174 6400 2.9834 5550480
3.533 3.1117 6600 2.9786 5724080
2.7837 3.2060 6800 2.9849 5896688
3.127 3.3003 7000 2.9557 6070544
2.7459 3.3946 7200 2.9538 6244624
3.2816 3.4889 7400 2.9563 6416176
3.3454 3.5832 7600 2.9709 6587616
2.7435 3.6775 7800 2.9540 6759696
3.1908 3.7718 8000 2.9497 6932384
2.6196 3.8661 8200 2.9479 7103328
3.1352 3.9604 8400 2.9728 7276304
2.9739 4.0547 8600 2.9485 7448112
2.5534 4.1490 8800 2.9555 7623632
2.5543 4.2433 9000 2.9271 7799248
2.5284 4.3376 9200 2.9455 7974368
2.8538 4.4319 9400 2.9379 8146384
2.7943 4.5262 9600 2.9336 8321456
2.4081 4.6205 9800 2.9336 8490096
2.7996 4.7148 10000 2.9381 8665904
2.3661 4.8091 10200 2.9455 8837712
2.966 4.9033 10400 2.9509 9010400
3.1711 4.9976 10600 2.9699 9185584
3.311 5.0919 10800 2.9491 9358160
2.4932 5.1862 11000 2.9626 9535520
3.3139 5.2805 11200 2.9806 9709232
2.7371 5.3748 11400 2.9771 9880896
2.9514 5.4691 11600 2.9830 10053056
3.3724 5.5634 11800 2.9859 10229152
3.1756 5.6577 12000 2.9848 10404384
2.9306 5.7520 12200 2.9874 10573872
2.9906 5.8463 12400 2.9814 10748304
3.0223 5.9406 12600 2.9898 10917920
3.4233 6.0349 12800 2.9858 11092736
2.554 6.1292 13000 2.9839 11269264
3.5816 6.2235 13200 2.9827 11441120
2.9904 6.3178 13400 2.9840 11614176
2.3922 6.4121 13600 2.9881 11785424
3.0193 6.5064 13800 2.9901 11960752
2.663 6.6007 14000 2.9775 12132672
3.3592 6.6950 14200 2.9821 12303424
3.1617 6.7893 14400 2.9830 12474592
3.1247 6.8835 14600 2.9735 12649424
2.4094 6.9778 14800 2.9854 12821280
2.8975 7.0721 15000 2.9798 12996208
3.3305 7.1664 15200 2.9779 13172592
2.8335 7.2607 15400 2.9754 13342864
3.2162 7.3550 15600 2.9741 13515600
3.1557 7.4493 15800 2.9858 13688640
3.123 7.5436 16000 2.9802 13863312
3.1461 7.6379 16200 2.9744 14032992
2.5753 7.7322 16400 2.9744 14205936
3.0835 7.8265 16600 2.9788 14378336
2.9754 7.9208 16800 2.9861 14551456
2.7244 8.0151 17000 2.9781 14730672
3.4109 8.1094 17200 2.9844 14904544
2.7873 8.2037 17400 2.9804 15078832
3.263 8.2980 17600 2.9827 15254544
2.6633 8.3923 17800 2.9717 15422256
2.6194 8.4866 18000 2.9880 15595776
2.8025 8.5809 18200 2.9845 15768288
3.2739 8.6752 18400 2.9862 15941776
3.0337 8.7694 18600 2.9897 16115152
3.0608 8.8637 18800 2.9865 16284384
3.5312 8.9580 19000 2.9885 16457552
2.6771 9.0523 19200 2.9896 16632272
2.8448 9.1466 19400 2.9859 16806304
3.4979 9.2409 19600 2.9860 16979072
3.4671 9.3352 19800 2.9841 17150160
3.6682 9.4295 20000 2.9850 17321280
2.8798 9.5238 20200 2.9839 17495488
3.5262 9.6181 20400 2.9818 17670576
3.3741 9.7124 20600 2.9857 17843440
2.8687 9.8067 20800 2.9815 18012496
2.7849 9.9010 21000 2.9824 18186480
2.7368 9.9953 21200 2.9828 18360368
2.6571 10.0896 21400 2.9818 18539664
2.6093 10.1839 21600 2.9837 18718016
2.8979 10.2782 21800 2.9838 18888560
2.3822 10.3725 22000 2.9835 19061328
2.8941 10.4668 22200 2.9847 19236176
2.2785 10.5611 22400 2.9793 19404288
2.7086 10.6554 22600 2.9829 19574224
3.0499 10.7496 22800 2.9829 19744496
2.5357 10.8439 23000 2.9834 19915984
2.8058 10.9382 23200 2.9827 20090944
3.2345 11.0325 23400 2.9843 20264992
2.6316 11.1268 23600 2.9810 20437952
3.344 11.2211 23800 2.9823 20611040
3.0959 11.3154 24000 2.9831 20787488
3.3262 11.4097 24200 2.9829 20958240
3.9468 11.5040 24400 2.9828 21133392
2.874 11.5983 24600 2.9810 21303360
2.9608 11.6926 24800 2.9846 21475184
2.9467 11.7869 25000 2.9840 21649744
2.8529 11.8812 25200 2.9841 21819728
3.0579 11.9755 25400 2.9836 21993120
2.9273 12.0698 25600 2.9827 22164624
3.3136 12.1641 25800 2.9837 22340064
2.507 12.2584 26000 2.9838 22515088
2.7376 12.3527 26200 2.9842 22692240
2.3293 12.4470 26400 2.9816 22864512
3.2821 12.5413 26600 2.9816 23037568
2.8383 12.6355 26800 2.9818 23207936
2.491 12.7298 27000 2.9850 23381376
2.7425 12.8241 27200 2.9821 23553008
3.0866 12.9184 27400 2.9818 23722608
3.0738 13.0127 27600 2.9836 23892928
2.6363 13.1070 27800 2.9819 24063632
3.15 13.2013 28000 2.9816 24237248
2.9501 13.2956 28200 2.9806 24411712
3.1561 13.3899 28400 2.9818 24584800
3.0268 13.4842 28600 2.9812 24759888
2.6915 13.5785 28800 2.9827 24936720
2.1768 13.6728 29000 2.9816 25110864
2.769 13.7671 29200 2.9800 25284944
3.4771 13.8614 29400 2.9812 25456816
3.0152 13.9557 29600 2.9807 25631728
2.8072 14.0500 29800 2.9815 25801056
3.4366 14.1443 30000 2.9832 25978896
3.0998 14.2386 30200 2.9835 26156672
2.6254 14.3329 30400 2.9815 26330592
3.1786 14.4272 30600 2.9817 26502800
2.803 14.5215 30800 2.9820 26671584
2.881 14.6157 31000 2.9818 26845568
3.4202 14.7100 31200 2.9797 27017952
2.9903 14.8043 31400 2.9816 27191600
2.9173 14.8986 31600 2.9821 27362144
2.5902 14.9929 31800 2.9826 27536992
3.4131 15.0872 32000 2.9853 27707728
3.1571 15.1815 32200 2.9846 27886368
4.0238 15.2758 32400 2.9850 28061984
2.7433 15.3701 32600 2.9855 28233360
2.9622 15.4644 32800 2.9850 28411200
2.9855 15.5587 33000 2.9855 28582944
2.8146 15.6530 33200 2.9850 28756240
2.1957 15.7473 33400 2.9850 28926208
3.0656 15.8416 33600 2.9850 29096816
2.781 15.9359 33800 2.9844 29267072
2.8638 16.0302 34000 2.9844 29435360
3.3154 16.1245 34200 2.9844 29610720
2.351 16.2188 34400 2.9844 29781472
2.8905 16.3131 34600 2.9844 29959568
2.4058 16.4074 34800 2.9844 30134704
3.0833 16.5017 35000 2.9844 30305200
3.0284 16.5959 35200 2.9844 30478576
2.7625 16.6902 35400 2.9844 30647744
2.5376 16.7845 35600 2.9844 30823072
3.3257 16.8788 35800 2.9844 30996032
2.8614 16.9731 36000 2.9844 31167328
2.8231 17.0674 36200 2.9844 31341392
2.4818 17.1617 36400 2.9844 31515648
3.5857 17.2560 36600 2.9844 31690208
2.8301 17.3503 36800 2.9844 31868288
2.6124 17.4446 37000 2.9844 32041536
3.5769 17.5389 37200 2.9844 32213584
3.0379 17.6332 37400 2.9844 32385920
2.76 17.7275 37600 2.9844 32555856
3.1422 17.8218 37800 2.9844 32729024
2.7946 17.9161 38000 2.9844 32902832
3.2047 18.0104 38200 2.9844 33076912
2.4714 18.1047 38400 2.9844 33248832
2.6947 18.1990 38600 2.9844 33420800
3.2644 18.2933 38800 2.9844 33594000
2.9905 18.3876 39000 2.9844 33765936
2.7909 18.4818 39200 2.9844 33936896
2.7959 18.5761 39400 2.9844 34110592
2.8925 18.6704 39600 2.9844 34284208
3.0191 18.7647 39800 2.9844 34458576
3.5334 18.8590 40000 2.9844 34633072
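
The table can be summarized programmatically. The sketch below parses a handful of rows copied verbatim from the table (not the full log), picks the row with the lowest validation loss, and derives steps per epoch and input tokens per optimizer step from the last row. The helper name parse_rows and the dictionary keys are illustrative choices.

```python
# A few rows copied from the table above; columns are
# training loss, epoch, step, validation loss, input tokens seen.
ROWS = """\
2.6453 0.0943 200 3.2031 174096
3.5809 0.1886 400 3.1503 344560
2.5543 4.2433 9000 2.9271 7799248
2.5284 4.3376 9200 2.9455 7974368
3.5334 18.8590 40000 2.9844 34633072
"""


def parse_rows(text: str) -> list[dict]:
    """Turn whitespace-separated table rows into typed records."""
    records = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5:
            continue  # skip blank or malformed lines
        train_loss, epoch, step, val_loss, tokens = parts
        records.append({
            "training_loss": float(train_loss),
            "epoch": float(epoch),
            "step": int(step),
            "validation_loss": float(val_loss),
            "input_tokens_seen": int(tokens),
        })
    return records


records = parse_rows(ROWS)
best = min(records, key=lambda r: r["validation_loss"])
last = records[-1]
steps_per_epoch = last["step"] / last["epoch"]
tokens_per_step = last["input_tokens_seen"] / last["step"]

if __name__ == "__main__":
    print("best step:", best["step"], "validation loss:", best["validation_loss"])
    print("steps per epoch:", round(steps_per_epoch))
    print("tokens per step:", round(tokens_per_step, 1))
```

On these rows the lowest validation loss is 2.9271 at step 9000, which matches the loss reported at the top of the card. The last row works out to roughly 2121 optimizer steps per epoch and about 866 input tokens per step.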

Framework versions

  • PEFT 0.15.2.dev0
  • Transformers 4.51.3
  • Pytorch 2.6.0+cu124
  • Datasets 3.5.0
  • Tokenizers 0.21.1
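
To compare a local environment against the versions listed above, a small standard-library check such as the following may help. The distribution names (peft, transformers, torch, datasets, tokenizers) are the usual PyPI names and are an assumption here. The PEFT entry is a development version, so a released install is expected to show up as a mismatch. The function name find_mismatches is hypothetical.

```python
from importlib import metadata

# Versions listed in this card, keyed by the assumed PyPI distribution name.
EXPECTED = {
    "peft": "0.15.2.dev0",
    "transformers": "4.51.3",
    "torch": "2.6.0+cu124",
    "datasets": "3.5.0",
    "tokenizers": "0.21.1",
}


def find_mismatches(expected=EXPECTED, get_version=metadata.version):
    """Return {name: (wanted, installed)} for packages that differ or are missing."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, installed) in find_mismatches().items():
        print(f"{name}: card lists {wanted}, installed {installed}")
```

A mismatch is not necessarily a problem; the check only shows where the environment differs from the one used for training.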
