---
library_name: transformers
base_model: /scratch/gpfs/jg9904/saved_models/Mistral-7B-Instruct-v0.3
tags:
  - alignment-handbook
  - generated_from_trainer
datasets:
  - /scratch/gpfs/jg9904/unintentional-unalignment/data_files/data-mistral-7b-instruct-sppo-iter1/90_variance
model-index:
  - name: mistral-dpo-lr-5.0e-7-beta-0.01
    results: []
---

# mistral-dpo-lr-5.0e-7-beta-0.01

This model is a fine-tuned version of /scratch/gpfs/jg9904/saved_models/Mistral-7B-Instruct-v0.3 on the /scratch/gpfs/jg9904/unintentional-unalignment/data_files/data-mistral-7b-instruct-sppo-iter1/90_variance dataset. It achieves the following results on the evaluation set (a sketch of how the DPO reward metrics are derived follows the list):

- Loss: 0.4177
- Rewards/chosen: -0.6130
- Rewards/rejected: -1.5691
- Rewards/accuracies: 0.8741
- Rewards/margins: 0.9561
- Logps/rejected: -497.8502
- Logps/chosen: -361.6783
- Logits/rejected All: -2.7429
- Logits/chosen All: -2.7344
- Logits/rejected Sum: 8021.8809
- Logits/chosen Sum: 8575.8633
- Logits/rejected Avg: 21.7435
- Logits/chosen Avg: 21.2499
- Gradient/inner Product: 1124073472.0
- Gradient/nabla Chosen Logps: 37120.0
- Gradient/nabla Rejected Logps: 45824.0
- Gradient/correlation: 0.4668
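
The `Rewards/*` entries are the standard DPO diagnostics logged by `trl`-style trainers: implicit rewards defined as the β-scaled log-probability ratio between the fine-tuned policy and the frozen reference model. A minimal sketch of how they are typically computed (tensor names are illustrative, not this repository's code; β = 0.01 matches the model name):

```python
import torch

def dpo_reward_metrics(policy_chosen_logps, policy_rejected_logps,
                       ref_chosen_logps, ref_rejected_logps, beta=0.01):
    """Recover the Rewards/* metrics above from summed response log-probs.

    All inputs are 1-D tensors of per-example log p(response | prompt)
    under the policy and the frozen reference model, respectively.
    """
    # Implicit DPO reward: beta * (log pi_theta - log pi_ref)
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    margins = chosen_rewards - rejected_rewards    # Rewards/margins
    accuracies = (margins > 0).float().mean()      # Rewards/accuracies
    return (chosen_rewards.mean(), rejected_rewards.mean(),
            margins.mean(), accuracies)
```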

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a scheduler sketch follows the list):

- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 4
- seed: 42
- distributed_type: multi-GPU
- num_devices: 8
- total_train_batch_size: 64
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
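
The effective batch size and schedule follow directly from these values: 8 examples per device × 8 GPUs = 64, and with warmup ratio 0.1 the learning rate ramps up over the first ~10% of steps before decaying along a cosine. A sketch using the `transformers` scheduler helper; the step count of ~264 is back-derived from the training log below (step 200 lands at epoch 0.7576), and the model here is only a placeholder:

```python
import torch
from transformers import get_cosine_schedule_with_warmup

model = torch.nn.Linear(8, 8)  # placeholder for the actual policy model

# 200 steps / 0.7576 epochs  =>  ~264 optimizer steps for the full epoch
num_training_steps = 264
num_warmup_steps = int(0.1 * num_training_steps)  # lr_scheduler_warmup_ratio: 0.1

optimizer = torch.optim.AdamW(  # HF Trainer's default Adam variant
    model.parameters(), lr=5e-7, betas=(0.9, 0.999), eps=1e-8
)
scheduler = get_cosine_schedule_with_warmup(
    optimizer,
    num_warmup_steps=num_warmup_steps,
    num_training_steps=num_training_steps,
)
```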

### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected All | Logits/chosen All | Logits/rejected Sum | Logits/chosen Sum | Logits/rejected Avg | Logits/chosen Avg | Gradient/inner Product | Gradient/nabla Chosen Logps | Gradient/nabla Rejected Logps | Gradient/correlation |
|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|:---:|
| No log | 0 | 0 | 0.6931 | 0.0 | 0.0 | 0.0 | 0.0 | -340.9357 | -300.3766 | -2.8879 | -2.8885 | 7277.5859 | 7814.5264 | 19.8154 | 19.5578 | 77594624.0 | 16256.0 | 17152.0 | 0.2236 |
| 0.6346 | 0.3788 | 100 | 0.5216 | -0.5880 | -1.1850 | 0.7396 | 0.5970 | -459.4361 | -359.1805 | -2.8931 | -2.9041 | 7889.0952 | 8463.7900 | 21.4027 | 21.0064 | 562036736.0 | 34304.0 | 41984.0 | 0.4043 |
| 0.6253 | 0.7576 | 200 | 0.4177 | -0.6130 | -1.5691 | 0.8741 | 0.9561 | -497.8502 | -361.6783 | -2.7429 | -2.7344 | 8021.8809 | 8575.8633 | 21.7435 | 21.2499 | 1124073472.0 | 37120.0 | 45824.0 | 0.4668 |
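
One sanity check worth noting: the step-0 validation loss of 0.6931 is ln 2, which is exactly what the DPO loss −log σ(β · margin) evaluates to when the policy still equals the reference and every margin is zero:

```python
import math

# DPO loss per example: -log(sigmoid(beta * margin)).
# At step 0 the policy matches the reference, so all margins are 0:
loss_at_init = -math.log(1.0 / (1.0 + math.exp(-0.0)))  # -log(sigmoid(0))
print(round(loss_at_init, 4))  # 0.6931 == ln 2, matching the step-0 row above
```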

### Framework versions

- Transformers 4.45.0
- Pytorch 2.5.1+cu124
- Datasets 2.14.6
- Tokenizers 0.20.4