mistral-7b-peptide-new-data

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.6867
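If this loss is the mean token-level cross-entropy reported by the `transformers` `Trainer` (the default for causal language modeling), it corresponds to an evaluation perplexity of roughly 5.4. A minimal sketch of the conversion, assuming the loss is measured in nats:

```python
# Convert mean cross-entropy loss (in nats) to perplexity.
# Assumes the reported value is the Trainer's default causal-LM eval loss.
import math

eval_loss = 1.6867
perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.2f}")  # perplexity: 5.40
```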

Model description

More information needed

Intended uses & limitations

More information needed
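Although this section is unfilled, the model name suggests a Mistral-style causal language model over peptide sequences. A minimal loading sketch, assuming the checkpoint is published on the Hub under a repo id matching the card title (the actual id and prompt format may differ):

```python
# Hypothetical usage sketch; the repo id is inferred from the card title,
# and the prompt below is only a placeholder amino-acid string.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "mistral-7b-peptide-new-data"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("MKTAYIAKQR", return_tensors="pt")  # placeholder peptide prompt
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```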

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch mapping them onto `TrainingArguments` follows the list):

  • learning_rate: 5e-05
  • train_batch_size: 3
  • eval_batch_size: 3
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 48
  • total_eval_batch_size: 12
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 30
  • training_steps: 8000
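
For reference, a minimal sketch of how these settings map onto Hugging Face `TrainingArguments`; the output directory is a placeholder, and this is not the model's actual training script:

```python
# Sketch of the hyperparameters above expressed as TrainingArguments.
# output_dir is hypothetical; the multi-GPU setup (4 devices) is handled
# by the launcher (e.g. torchrun or accelerate), not by these arguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="mistral-7b-peptide-new-data",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=3,
    per_device_eval_batch_size=3,
    seed=42,
    gradient_accumulation_steps=4,
    lr_scheduler_type="cosine",
    warmup_steps=30,
    max_steps=8000,  # training_steps above
)
# Effective batch sizes with 4 devices:
#   train: 3 per device * 4 devices * 4 accumulation steps = 48
#   eval:  3 per device * 4 devices = 12
```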

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.0847 | 0.0063 | 50 | 1.7965 |
| 1.5136 | 0.0125 | 100 | 1.4393 |
| 1.3045 | 0.0187 | 150 | 1.2247 |
| 1.1527 | 0.025 | 200 | 1.1058 |
| 1.0242 | 0.0312 | 250 | 0.9986 |
| 0.6946 | 0.0375 | 300 | 0.9598 |
| 0.5299 | 0.0437 | 350 | 0.9767 |
| 0.5212 | 0.05 | 400 | 0.9270 |
| 0.4852 | 0.0563 | 450 | 0.9116 |
| 0.4739 | 0.0625 | 500 | 0.8924 |
| 0.3936 | 0.0688 | 550 | 0.9344 |
| 0.3514 | 0.075 | 600 | 0.9703 |
| 0.3544 | 0.0813 | 650 | 0.9725 |
| 0.3532 | 0.0875 | 700 | 0.9607 |
| 0.3524 | 0.0938 | 750 | 0.9420 |
| 0.3586 | 0.1 | 800 | 0.9656 |
| 0.3171 | 0.1062 | 850 | 0.9821 |
| 0.3218 | 0.1125 | 900 | 0.9767 |
| 0.3264 | 0.1187 | 950 | 0.9796 |
| 0.3211 | 0.125 | 1000 | 0.9690 |
| 0.3279 | 0.1313 | 1050 | 0.9566 |
| 0.2685 | 0.1375 | 1100 | 1.0596 |
| 0.2714 | 0.1437 | 1150 | 1.0163 |
| 0.2865 | 0.15 | 1200 | 1.0333 |
| 0.2763 | 0.1562 | 1250 | 1.0295 |
| 0.2823 | 0.1625 | 1300 | 1.0135 |
| 0.2222 | 0.1688 | 1350 | 1.0825 |
| 0.2091 | 0.175 | 1400 | 1.0671 |
| 0.323 | 0.1812 | 1450 | 1.1212 |
| 0.2035 | 0.1875 | 1500 | 1.0661 |
| 0.1991 | 0.1938 | 1550 | 1.0609 |
| 0.1775 | 0.2 | 1600 | 1.1045 |
| 0.1624 | 0.2062 | 1650 | 1.1419 |
| 0.1829 | 0.2125 | 1700 | 1.0643 |
| 0.1879 | 0.2188 | 1750 | 1.1223 |
| 0.1667 | 0.225 | 1800 | 1.1179 |
| 0.1618 | 0.2313 | 1850 | 1.1347 |
| 0.1469 | 0.2375 | 1900 | 1.1522 |
| 0.151 | 0.2437 | 1950 | 1.1615 |
| 0.1609 | 0.25 | 2000 | 1.1471 |
| 0.1504 | 0.2562 | 2050 | 1.1457 |
| 0.1452 | 0.2625 | 2100 | 1.1527 |
| 0.1341 | 0.2687 | 2150 | 1.1743 |
| 0.137 | 0.275 | 2200 | 1.1742 |
| 0.1387 | 0.2812 | 2250 | 1.1652 |
| 0.1357 | 0.2875 | 2300 | 1.1657 |
| 0.1347 | 0.2938 | 2350 | 1.1545 |
| 0.1217 | 0.3 | 2400 | 1.1933 |
| 0.1226 | 0.3063 | 2450 | 1.1882 |
| 0.1273 | 0.3125 | 2500 | 1.1999 |
| 0.124 | 0.3187 | 2550 | 1.1980 |
| 0.1197 | 0.325 | 2600 | 1.2027 |
| 0.1134 | 0.3312 | 2650 | 1.2114 |
| 0.112 | 0.3375 | 2700 | 1.2340 |
| 0.1132 | 0.3438 | 2750 | 1.2302 |
| 0.1127 | 0.35 | 2800 | 1.2177 |
| 0.1088 | 0.3563 | 2850 | 1.2415 |
| 0.1022 | 0.3625 | 2900 | 1.2502 |
| 0.0988 | 0.3688 | 2950 | 1.2659 |
| 0.0998 | 0.375 | 3000 | 1.2661 |
| 0.1012 | 0.3812 | 3050 | 1.2714 |
| 0.0974 | 0.3875 | 3100 | 1.2615 |
| 0.0948 | 0.3937 | 3150 | 1.2465 |
| 0.0903 | 0.4 | 3200 | 1.2662 |
| 0.088 | 0.4062 | 3250 | 1.2820 |
| 0.0895 | 0.4125 | 3300 | 1.2749 |
| 0.0903 | 0.4188 | 3350 | 1.2479 |
| 0.0871 | 0.425 | 3400 | 1.2638 |
| 0.0733 | 0.4313 | 3450 | 1.3351 |
| 0.0735 | 0.4375 | 3500 | 1.3046 |
| 0.0818 | 0.4437 | 3550 | 1.3131 |
| 0.08 | 0.45 | 3600 | 1.3224 |
| 0.0795 | 0.4562 | 3650 | 1.3298 |
| 0.0746 | 0.4625 | 3700 | 1.3175 |
| 0.0706 | 0.4688 | 3750 | 1.3807 |
| 0.0711 | 0.475 | 3800 | 1.3475 |
| 0.0748 | 0.4813 | 3850 | 1.3502 |
| 0.0705 | 0.4875 | 3900 | 1.3271 |
| 0.0685 | 0.4938 | 3950 | 1.3551 |
| 0.0663 | 0.5 | 4000 | 1.3735 |
| 0.0663 | 0.5062 | 4050 | 1.3789 |
| 0.0654 | 0.5125 | 4100 | 1.3495 |
| 0.0658 | 0.5188 | 4150 | 1.3363 |
| 0.0633 | 0.525 | 4200 | 1.3569 |
| 0.0621 | 0.5312 | 4250 | 1.3798 |
| 0.0636 | 0.5375 | 4300 | 1.3904 |
| 0.0635 | 0.5437 | 4350 | 1.4183 |
| 0.0597 | 0.55 | 4400 | 1.3955 |
| 0.0574 | 0.5563 | 4450 | 1.3847 |
| 0.0588 | 0.5625 | 4500 | 1.4347 |
| 0.0575 | 0.5687 | 4550 | 1.4519 |
| 0.0574 | 0.575 | 4600 | 1.4268 |
| 0.056 | 0.5813 | 4650 | 1.4242 |
| 0.0535 | 0.5875 | 4700 | 1.4149 |
| 0.0523 | 0.5938 | 4750 | 1.4397 |
| 0.0463 | 0.6 | 4800 | 1.4837 |
| 0.0485 | 0.6062 | 4850 | 1.4928 |
| 0.0472 | 0.6125 | 4900 | 1.4878 |
| 0.0465 | 0.6188 | 4950 | 1.5182 |
| 0.0391 | 0.625 | 5000 | 1.4831 |
| 0.0389 | 0.6312 | 5050 | 1.4707 |
| 0.0443 | 0.6375 | 5100 | 1.4903 |
| 0.0367 | 0.6438 | 5150 | 1.5244 |
| 0.033 | 0.65 | 5200 | 1.4586 |
| 0.0352 | 0.6562 | 5250 | 1.4376 |
| 0.0353 | 0.6625 | 5300 | 1.5125 |
| 0.0309 | 0.6687 | 5350 | 1.5366 |
| 0.0273 | 0.675 | 5400 | 1.4890 |
| 0.0313 | 0.6813 | 5450 | 1.5407 |
| 0.0243 | 0.6875 | 5500 | 1.5580 |
| 0.0259 | 0.6937 | 5550 | 1.5675 |
| 0.0247 | 0.7 | 5600 | 1.5824 |
| 0.0212 | 0.7063 | 5650 | 1.5901 |
| 0.0228 | 0.7125 | 5700 | 1.5499 |
| 0.0248 | 0.7188 | 5750 | 1.5870 |
| 0.0252 | 0.725 | 5800 | 1.5419 |
| 0.0177 | 0.7312 | 5850 | 1.5714 |
| 0.0239 | 0.7375 | 5900 | 1.5993 |
| 0.0252 | 0.7438 | 5950 | 1.5668 |
| 0.0243 | 0.75 | 6000 | 1.5898 |
| 0.0219 | 0.7562 | 6050 | 1.5875 |
| 0.0208 | 0.7625 | 6100 | 1.5930 |
| 0.0245 | 0.7688 | 6150 | 1.5847 |
| 0.0216 | 0.775 | 6200 | 1.6443 |
| 0.0222 | 0.7812 | 6250 | 1.6116 |
| 0.0175 | 0.7875 | 6300 | 1.6632 |
| 0.0211 | 0.7937 | 6350 | 1.6293 |
| 0.0218 | 0.8 | 6400 | 1.6341 |
| 0.0212 | 0.8063 | 6450 | 1.6336 |
| 0.0198 | 0.8125 | 6500 | 1.6720 |
| 0.0217 | 0.8187 | 6550 | 1.6364 |
| 0.0211 | 0.825 | 6600 | 1.6325 |
| 0.0196 | 0.8313 | 6650 | 1.6860 |
| 0.0231 | 0.8375 | 6700 | 1.6489 |
| 0.0216 | 0.8438 | 6750 | 1.6443 |
| 0.0229 | 0.85 | 6800 | 1.6406 |
| 0.0204 | 0.8562 | 6850 | 1.6545 |
| 0.0219 | 0.8625 | 6900 | 1.6468 |
| 0.0235 | 0.8688 | 6950 | 1.6207 |
| 0.022 | 0.875 | 7000 | 1.6522 |
| 0.0188 | 0.8812 | 7050 | 1.6853 |
| 0.0204 | 0.8875 | 7100 | 1.6584 |
| 0.0197 | 0.8938 | 7150 | 1.6843 |
| 0.0208 | 0.9 | 7200 | 1.7061 |
| 0.0205 | 0.9062 | 7250 | 1.6769 |
| 0.0235 | 0.9125 | 7300 | 1.6619 |
| 0.0198 | 0.9187 | 7350 | 1.6702 |
| 0.0216 | 0.925 | 7400 | 1.6880 |
| 0.0221 | 0.9313 | 7450 | 1.6701 |
| 0.0224 | 0.9375 | 7500 | 1.6614 |
| 0.0193 | 0.9437 | 7550 | 1.6734 |
| 0.0208 | 0.95 | 7600 | 1.6836 |
| 0.0199 | 0.9563 | 7650 | 1.6981 |
| 0.0223 | 0.9625 | 7700 | 1.6915 |
| 0.0184 | 0.9688 | 7750 | 1.6524 |
| 0.018 | 0.975 | 7800 | 1.7108 |
| 0.0184 | 0.9812 | 7850 | 1.6707 |
| 0.022 | 0.9875 | 7900 | 1.6735 |
| 0.0229 | 0.9938 | 7950 | 1.6733 |
| 0.0222 | 1.0 | 8000 | 1.6867 |
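
Note that validation loss reaches its minimum (0.8924) at step 500 and then rises steadily to 1.6867 while training loss keeps falling, a pattern consistent with overfitting; by this metric the step-500 checkpoint is the strongest.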

Framework versions

  • Transformers 4.44.0.dev0
  • Pytorch 2.1.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1