legal-mcq-gemma-2b

This model is a fine-tuned version of google/gemma-2-2b-it. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0430

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch mapping them onto TrainingArguments follows the list:

  • learning_rate: 0.0002
  • train_batch_size: 4
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: paged_adamw_8bit with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • num_epochs: 3
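
The sketch below is not the original training script; it only shows how the hyperparameters above would map onto transformers.TrainingArguments. The output directory name is an assumption, and the dataset, model, and PEFT wiring are omitted because they are not documented in this card.

```python
# Hedged sketch: maps the hyperparameters listed above onto transformers.TrainingArguments.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="legal-mcq-gemma-2b",   # assumed output directory, not stated in the card
    learning_rate=2e-4,                # learning_rate: 0.0002
    per_device_train_batch_size=4,     # train_batch_size: 4
    per_device_eval_batch_size=8,      # eval_batch_size: 8
    gradient_accumulation_steps=2,     # total_train_batch_size: 8 (4 x 2)
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=100,
    optim="paged_adamw_8bit",          # paged 8-bit AdamW, as listed above
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    seed=42,
)
```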

Training results

Training Loss | Epoch  | Step | Validation Loss
1.6312        | 0.1348 | 50   | 1.4909
1.8023        | 0.2695 | 100  | 1.7316
1.8044        | 0.4043 | 150  | 1.7173
1.6679        | 0.5391 | 200  | 1.1123
1.3583        | 0.6739 | 250  | 0.8995
1.1691        | 0.8086 | 300  | 0.6657
0.9808        | 0.9434 | 350  | 0.7485
0.6050        | 1.0782 | 400  | 0.5601
0.5032        | 1.2129 | 450  | 0.4895
0.4466        | 1.3477 | 500  | 0.4132
0.4158        | 1.4825 | 550  | 0.3070
0.3581        | 1.6173 | 600  | 0.2680
0.3132        | 1.7520 | 650  | 0.2225
0.2682        | 1.8868 | 700  | 0.1625
0.2197        | 2.0216 | 750  | 0.1424
0.1341        | 2.1563 | 800  | 0.1202
0.1058        | 2.2911 | 850  | 0.1032
0.1038        | 2.4259 | 900  | 0.0847
0.0784        | 2.5606 | 950  | 0.0702
0.0764        | 2.6954 | 1000 | 0.0529
0.0737        | 2.8302 | 1050 | 0.0451
0.0620        | 2.9650 | 1100 | 0.0430

Framework versions

  • PEFT 0.16.0
  • Transformers 4.54.0
  • Pytorch 2.6.0+cu124
  • Datasets 4.0.0
  • Tokenizers 0.21.2
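
Because the checkpoint is a PEFT adapter, inference loads the base model first and then attaches the adapter. Below is a minimal sketch using the frameworks above, assuming the adapter ID jonathanagustin/legal-mcq-gemma-2b on top of google/gemma-2-2b-it; the bfloat16 dtype and the prompt are illustrative assumptions, since the expected multiple-choice format is not documented in this card.

```python
# Hedged sketch: load the PEFT adapter on top of the instruction-tuned base model.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "google/gemma-2-2b-it"
adapter_id = "jonathanagustin/legal-mcq-gemma-2b"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
model = PeftModel.from_pretrained(base_model, adapter_id)

# Invented prompt; the exact multiple-choice format used in training is not documented here.
prompt = (
    "Which element is required to form a valid contract?\n"
    "A) Consideration\nB) Notarization\nC) Witnesses\nD) A seal\nAnswer:"
)
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```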

Model tree for jonathanagustin/legal-mcq-gemma-2b

  • Base model: google/gemma-2-2b
  • Adapter: this model