adhi29/openhermes-mistral-dpo-gptq

Files changed (5)

README.md +13 -13
adapter_config.json +2 -2
adapter_model.safetensors +1 -1
runs/Jan10_00-42-00_b673d1c2513e/events.out.tfevents.1704847405.b673d1c2513e.828.0 +3 -0
training_args.bin +1 -1

README.md CHANGED

@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [TheBloke/OpenHermes-2-Mistral-7B-GPTQ](https://huggingface.co/TheBloke/OpenHermes-2-Mistral-7B-GPTQ) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.9866
-- Rewards/chosen: 0.2715
-- Rewards/rejected: 0.5084
-- Rewards/accuracies: 0.625
-- Rewards/margins: -0.2369
-- Logps/rejected: -217.0748
-- Logps/chosen: -192.7873
-- Logits/rejected: -2.1497
-- Logits/chosen: -2.0212
+- Loss: 0.6796
+- Rewards/chosen: 0.0825
+- Rewards/rejected: 0.0991
+- Rewards/accuracies: 0.375
+- Rewards/margins: -0.0166
+- Logps/rejected: -111.4488
+- Logps/chosen: -104.0037
+- Logits/rejected: -1.8100
+- Logits/chosen: -1.8966
 ## Model description
@@ -58,10 +58,10 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-| 0.6752        | 0.01  | 10   | 0.7338          | 0.0443         | 0.0693           | 0.875              | -0.0250         | -221.4665      | -195.0593    | -2.1454         | -2.0106       |
-| 0.71          | 0.01  | 20   | 0.7099          | 0.0825         | 0.0676           | 0.875              | 0.0149          | -221.4828      | -194.6768    | -2.1435         | -2.0127       |
-| 0.6938        | 0.01  | 30   | 0.8421          | 0.1926         | 0.3222           | 0.625              | -0.1296         | -218.9368      | -193.5758    | -2.1482         | -2.0177       |
-| 0.6923        | 0.02  | 40   | 0.9866          | 0.2715         | 0.5084           | 0.625              | -0.2369         | -217.0748      | -192.7873    | -2.1497         | -2.0212       |
+| 0.6782        | 0.01  | 10   | 0.6854          | 0.0505         | 0.0332           | 0.5625             | 0.0173          | -112.1082      | -104.3235    | -1.7988         | -1.8929       |
+| 0.7064        | 0.01  | 20   | 0.6812          | 0.0509         | 0.0279           | 0.8125             | 0.0230          | -112.1610      | -104.3192    | -1.8032         | -1.8968       |
+| 0.7024        | 0.01  | 30   | 0.6820          | 0.0697         | 0.0728           | 0.375              | -0.0031         | -111.7118      | -104.1311    | -1.8068         | -1.8953       |
+| 0.6946        | 0.02  | 40   | 0.6796          | 0.0825         | 0.0991           | 0.375              | -0.0166         | -111.4488      | -104.0037    | -1.8100         | -1.8966       |
 ### Framework versions
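
Reviewer note: in the README metrics above, the Rewards/margins column appears to be Rewards/chosen minus Rewards/rejected. The sketch below checks that relationship against the two step-40 rows copied from this diff. The helper name `margin` and the 1e-4 tolerance (to allow for rounding in the logged values) are assumptions made for illustration, not part of the training code.

```python
# (chosen, rejected, reported margin) copied from the step-40 rows of the README diff
rows = {
    "before": (0.2715, 0.5084, -0.2369),
    "after": (0.0825, 0.0991, -0.0166),
}


def margin(chosen, rejected):
    """Reward margin, assumed here to be chosen minus rejected."""
    return chosen - rejected


# True for a row when the recomputed margin matches the reported one
# within the assumed rounding tolerance.
checks = {
    label: abs(margin(chosen, rejected) - reported) <= 1e-4
    for label, (chosen, rejected, reported) in rows.items()
}

for label, ok in checks.items():
    print(label, "consistent" if ok else "inconsistent")
```

Both final margins are negative, meaning the rejected completions received a higher implicit reward than the chosen ones at step 40 in both the old and the new run.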

adapter_config.json CHANGED

@@ -19,8 +19,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
-    "q_proj"
+    "q_proj",
+    "v_proj"
   ],
   "task_type": "CAUSAL_LM"
 }

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:c98da1863b9dadf52f7f528d729e4c15130e1da81c318696bd8b3a409507ab5b
+oid sha256:2ff3ec2e748f617f9b599dcfbaa9e13457ed4e402b5b4de32ed2344602b4885b
 size 13648432

runs/Jan10_00-42-00_b673d1c2513e/events.out.tfevents.1704847405.b673d1c2513e.828.0 ADDED

@@ -0,0 +1,3 @@
+version https://git-lfs.github.com/spec/v1
+oid sha256:8185520792eee8bae18dfc092e376de97f928596edf97989af18d6c0326c5551
+size 11244

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1b99c58734a5e5e9083464ec5d565517cbaf9a7e4545a53ff8e9a45bab558ed9
+oid sha256:943e7cfaaad8feff68d549e45309810be3943d1389b7dc677ce98d17b9f3b5ad
 size 4155
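
Reviewer note: the three binary files in this commit are shown as Git LFS pointer files, so the diff exposes only the spec version, the SHA-256 object id, and the byte size rather than the binary contents. A minimal sketch of reading such a pointer follows. The function name `parse_lfs_pointer` and the returned dictionary keys are hypothetical, and the sample text is the new `training_args.bin` pointer copied from this diff.

```python
def parse_lfs_pointer(text):
    """Split a Git LFS pointer file into its version, hash algorithm, digest, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>", separated by a single space.
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:943e7cfaaad8feff68d549e45309810be3943d1389b7dc677ce98d17b9f3b5ad
size 4155
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

For `adapter_model.safetensors` and `training_args.bin` the size is unchanged while the oid differs, which indicates that the file contents changed but the byte length stayed the same.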