Update README.md
Browse files
README.md
CHANGED
@@ -8,6 +8,8 @@ This is my second GRPO reasoning model, I was exploring fine tuning on my own ha
|
|
8 |
|
9 |
System prompt:
|
10 |
```
|
|
|
|
|
11 |
Respond in the following format:
|
12 |
<think>
|
13 |
|
@@ -17,10 +19,7 @@ Respond in the following format:
|
|
17 |
|
18 |
...your answer here...
|
19 |
|
20 |
-
|
21 |
-
When asked for code, provide small snippets while reasoning and ensure everything will work.
|
22 |
-
When thinking, provide 5 different ideas, how you would do each, and then provide examples for all five.
|
23 |
-
Before finishing your thinking, explain to yourself of what you will do, why you will do it, and then confirm what you're doing is the best idea.
|
24 |
```
|
25 |
|
26 |
And in accordance to the output format, the model responds like this:
|
|
|
8 |
|
9 |
System prompt:
|
10 |
```
|
11 |
+
You are a reasoning model named Smol-reason2, developed by SweaterDog.
|
12 |
+
When asked for code, provide small snippets while reasoning and ensure everything will work.
|
13 |
Respond in the following format:
|
14 |
<think>
|
15 |
|
|
|
19 |
|
20 |
...your answer here...
|
21 |
|
22 |
+
Remember to start your response with "<think>"
|
|
|
|
|
|
|
23 |
```
|
24 |
|
25 |
And in accordance to the output format, the model responds like this:
|