Training in progress, step 50

Files changed (4)

chat_2025-01-29_12-31-50.json +38 -0
chat_2025-01-29_12-33-06.json +38 -0
model.safetensors +1 -1
training_args.bin +1 -1

chat_2025-01-29_12-31-50.json ADDED

	@@ -0,0 +1,38 @@

+{
+    "settings": {
+        "command": "chat",
+        "model_name_or_path": "/root/trl/dynamics/Qwen_0.5-DPO_5e-5-100ep_0alp_0lam",
+        "user": null,
+        "system_prompt": null,
+        "save_folder": "./chat_history/",
+        "device": "cpu",
+        "examples_path": null,
+        "max_new_tokens": 256,
+        "do_sample": true,
+        "num_beams": 1,
+        "temperature": 1.0,
+        "top_k": 50,
+        "top_p": 1.0,
+        "repetition_penalty": 1.0,
+        "eos_tokens": null,
+        "eos_token_ids": null,
+        "model_revision": "main",
+        "torch_dtype": null,
+        "trust_remote_code": false,
+        "attn_implementation": null,
+        "load_in_8bit": false,
+        "load_in_4bit": false,
+        "bnb_4bit_quant_type": "nf4",
+        "use_bnb_nested_quant": false
+    },
+    "chat_history": [
+        {
+            "role": "user",
+            "content": "if X^2 + 3 = 12. What is X?"
+        },
+        {
+            "role": "assistant",
+            "content": "Given the equation:\r\n\r\nX^2 + 3 = 12\r\n\r\nTo find X, we'll first make the equation easier to work with, by moving the 3 to the right side:\r\n\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 - 3 = 12\r\nX^2 + 3 & 93177175136111938711541797"
+        }
+    ]
+}

chat_2025-01-29_12-33-06.json ADDED

	@@ -0,0 +1,38 @@

+{
+    "settings": {
+        "command": "chat",
+        "model_name_or_path": "/root/trl/dynamics/Qwen_0.5-DPO_5e-5-100ep_0alp_0lam",
+        "user": null,
+        "system_prompt": null,
+        "save_folder": "./chat_history/",
+        "device": "cpu",
+        "examples_path": null,
+        "max_new_tokens": 256,
+        "do_sample": true,
+        "num_beams": 1,
+        "temperature": 1.0,
+        "top_k": 50,
+        "top_p": 1.0,
+        "repetition_penalty": 1.0,
+        "eos_tokens": null,
+        "eos_token_ids": null,
+        "model_revision": "main",
+        "torch_dtype": null,
+        "trust_remote_code": false,
+        "attn_implementation": null,
+        "load_in_8bit": false,
+        "load_in_4bit": false,
+        "bnb_4bit_quant_type": "nf4",
+        "use_bnb_nested_quant": false
+    },
+    "chat_history": [
+        {
+            "role": "user",
+            "content": "if X^2 + 3 = 12. What is X?"
+        },
+        {
+            "role": "assistant",
+            "content": "The correct form of the question would be (X)\u00b2 + 3 = 12, where (X) is the variable and 12 is the right side of the equation.\n\nTo solve the problem, we'll apply the concept of completing the square. In summary, when you apply this concept, the equation is transformed into the following form:\n\n1. (x + 3)\u00b2 = 12\n2. Apply the concept of completing the square, by adding 3 to both sides, to transform the question into the following form:\n\n1. ((x+3)\u00b2 = 12\n2. Add 3 to both sides to find the resulting equation: ((x+3)\u00b2 = 12\n3. Finally, we'll apply the concept of completing the square and subtract 3, to find the final form of the question: ((x+3)\u00b2 = 12.\n\nFinally, we'll find that X is the square root of 12, which means that its square, 12, is also a root, and that X is positive (since the problem states that X is a positive number). It is, by definition, a number greater than 1, and it allows us to make"
+        }
+    ]
+}
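
For reference when reading the two sampled responses above: the prompt has a short closed-form solution, and neither response reaches it. The derivation below is a plain algebra check on the prompt and is not part of the commit.

```latex
\begin{align*}
X^2 + 3 &= 12 \\
X^2 &= 12 - 3 = 9 \\
X &= \pm\sqrt{9} = \pm 3
\end{align*}
```

Substituting back confirms both roots, since $(\pm 3)^2 + 3 = 9 + 3 = 12$.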

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:313b8a92a32258d38be615bc9a8e655e852487ae4dbfb81484b1f69d9afb13b7
+oid sha256:b8bf18ecca700448bdf4f60ee24fa9dcd9b618cea0ff7b3d3c02d291a55ce770
 size 1260367448

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:22441e636050e27d69b0fd135f9af4c4bedeb98a1dbe55fea5bbad6371786824
+oid sha256:b6d58aad2788b7d63d352c117a8975444dd682d1d960f74970206d669a928c86
 size 7672
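
The two binary files above are stored as Git LFS pointers: three `key value` lines giving the spec version, the content hash (`oid`), and the byte size. As a minimal sketch, assuming the real file has already been downloaded locally, a pointer can be parsed and checked against the file contents as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical and are not part of Git LFS or any library.

```python
import hashlib


def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file (hypothetical helper)."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid field has the form "<hash-algorithm>:<hex-digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data: bytes, pointer: dict) -> bool:
    """Return True if the bytes have the size and digest recorded in the pointer."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# New pointer for training_args.bin, copied from the diff above.
pointer_text = """version https://git-lfs.github.com/spec/v1
oid sha256:b6d58aad2788b7d63d352c117a8975444dd682d1d960f74970206d669a928c86
size 7672
"""
pointer = parse_lfs_pointer(pointer_text)
```

In both pointer diffs the `size` line is unchanged and only the `oid` differs, which indicates that the file contents changed while the byte length stayed the same.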