
Llama-2-7b-spin-rephrased-10k

This model is a fine-tuned version of meta-llama/Llama-2-7b-hf on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1071
  • Rewards/real: 10.2171
  • Rewards/generated: -7.6243
  • Rewards/accuracies: 1.0
  • Rewards/margins: 17.8413
  • Logps/generated: -358.9117
  • Logps/real: -104.6875
  • Logits/generated: -0.8781
  • Logits/real: -1.4494

Model description

More information needed

Intended uses & limitations

More information needed
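Until this section is filled in, the snippet below is only a minimal loading sketch. It assumes the checkpoint is published as AmberYifan/Llama-2-7b-spin-rephrased-10k and exposes the standard Llama-2 causal-LM interface; the prompt and generation settings are purely illustrative.

```python
# Minimal loading sketch (assumption: the checkpoint lives at
# "AmberYifan/Llama-2-7b-spin-rephrased-10k" and behaves like a
# standard Llama-2 causal LM).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "AmberYifan/Llama-2-7b-spin-rephrased-10k"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # published weights are stored in BF16
    device_map="auto",
)

prompt = "Summarize the idea of self-play fine-tuning in one sentence."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```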

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (an illustrative TrainingArguments sketch follows the list):

  • learning_rate: 5e-07
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 32
  • total_eval_batch_size: 16
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3
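
For illustration, the values above might be expressed with transformers.TrainingArguments roughly as sketched below. The actual training script is not part of this card, so treat the mapping (and the output path) as an assumption rather than the real configuration.

```python
# Hedged sketch: how the listed hyperparameters might map onto
# transformers.TrainingArguments (Transformers 4.43.3). Illustrative only;
# the real training script is not included in this card.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Llama-2-7b-spin-rephrased-10k",  # hypothetical output path
    learning_rate=5e-7,
    per_device_train_batch_size=4,   # 4 per GPU x 4 GPUs x 2 accumulation steps = 32 total
    per_device_eval_batch_size=4,    # 4 per GPU x 4 GPUs = 16 total
    gradient_accumulation_steps=2,
    seed=42,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    bf16=True,  # published weights are BF16
)
```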

Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/real | Rewards/generated | Rewards/accuracies | Rewards/margins | Logps/generated | Logps/real | Logits/generated | Logits/real |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1687 | 0.1984 | 62 | 0.1554 | 5.2053 | -5.2548 | 1.0 | 10.4601 | -335.2168 | -154.8048 | -0.7218 | -0.4019 |
| 0.1204 | 0.3968 | 124 | 0.1153 | 9.3697 | -4.5235 | 1.0 | 13.8932 | -327.9041 | -113.1613 | -0.8262 | -1.1627 |
| 0.1114 | 0.5952 | 186 | 0.1125 | 9.6740 | -5.3166 | 1.0 | 14.9906 | -335.8354 | -110.1185 | -0.8446 | -1.2393 |
| 0.1094 | 0.7936 | 248 | 0.1110 | 9.8335 | -5.4853 | 1.0 | 15.3188 | -337.5219 | -108.5231 | -0.8538 | -1.2560 |
| 0.1115 | 0.9920 | 310 | 0.1100 | 9.9127 | -6.4827 | 1.0 | 16.3954 | -347.4966 | -107.7317 | -0.8658 | -1.3304 |
| 0.1046 | 1.1904 | 372 | 0.1093 | 9.9819 | -6.6707 | 1.0 | 16.6526 | -349.3765 | -107.0395 | -0.8656 | -1.3633 |
| 0.1067 | 1.3888 | 434 | 0.1089 | 10.0127 | -7.5740 | 1.0 | 17.5868 | -358.4094 | -106.7308 | -0.8814 | -1.3898 |
| 0.1038 | 1.5872 | 496 | 0.1083 | 10.0730 | -7.0038 | 1.0 | 17.0768 | -352.7069 | -106.1281 | -0.8755 | -1.3615 |
| 0.0996 | 1.7856 | 558 | 0.1079 | 10.1219 | -7.0176 | 1.0 | 17.1396 | -352.8456 | -105.6391 | -0.8467 | -1.3431 |
| 0.1058 | 1.9840 | 620 | 0.1077 | 10.1479 | -7.4808 | 1.0 | 17.6287 | -357.4770 | -105.3797 | -0.8821 | -1.4055 |
| 0.0995 | 2.1824 | 682 | 0.1074 | 10.1669 | -7.1947 | 1.0 | 17.3617 | -354.6166 | -105.1890 | -0.8781 | -1.4102 |
| 0.1017 | 2.3808 | 744 | 0.1073 | 10.1849 | -7.6243 | 1.0 | 17.8092 | -358.9117 | -105.0093 | -0.8806 | -1.4228 |
| 0.1031 | 2.5792 | 806 | 0.1072 | 10.2106 | -7.6581 | 1.0 | 17.8687 | -359.2500 | -104.7519 | -0.8787 | -1.4391 |
| 0.1025 | 2.7776 | 868 | 0.1071 | 10.2105 | -7.6804 | 1.0 | 17.8909 | -359.4730 | -104.7534 | -0.8824 | -1.4506 |
| 0.1067 | 2.9760 | 930 | 0.1071 | 10.2171 | -7.6243 | 1.0 | 17.8413 | -358.9117 | -104.6875 | -0.8781 | -1.4494 |
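
A note on the reward columns in the table above: Rewards/margins is the difference Rewards/real minus Rewards/generated (final row: 10.2171 − (−7.6243) ≈ 17.84). The rewards themselves are assumed here to follow the usual DPO/SPIN convention of a scaled log-probability ratio against the reference model; the card does not state the training objective, so the formula below is an assumption rather than a description of the actual loss.

$$
r_{\text{real}} = \beta\left(\log \pi_\theta(y_{\text{real}} \mid x) - \log \pi_{\text{ref}}(y_{\text{real}} \mid x)\right),
\qquad
\text{margin} = r_{\text{real}} - r_{\text{generated}}
$$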

Framework versions

  • Transformers 4.43.3
  • Pytorch 2.2.2+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1