THE q8_0 version appears to go on and on indefinitely.

by BigDeeper - opened Feb 4

Discussion

BigDeeper

Feb 4

What is the stopping criterion?

brittlewis12

Owner Feb 4

How are you running the model?

I have yet to experience this in my (admittedly limited) testing; I invoked it with llama-cli locally like this, with chatml explicitly as the chat template format:

❯ time build/bin/llama-cli \
 -m ~/code/autogguf-rs/s1-32B/s1-32b.Q8_0.gguf \
 -t 10 --keep 4 -fa -ngl 9999 -b 512 -cnv \
 --chat-template chatml --temp 0.7 \
 -p "You are a friendly assistant. Current date: 2025-02-03" \
 -c 4096

BigDeeper

Feb 4

I am using Ollama. So I put together this model file:

FROM /opt/data/data/brittlewis12/s1-32B-GGUF/s1-32b.Q8_0.gguf

PARAMETER temperature 0.2
PARAMETER stop "<|im_end|>"
PARAMETER stop "<|endoftext|>"

TEMPLATE """
{{- if .First}}
<|im_start|>system
{{.System}}<|im_end|>
{{- end}}
<|im_start|>user
{{.Prompt}}<|im_end|>
<|im_start|>assistant
"""

SYSTEM """ You are an AI assistant that uses deep logic to solve problems """

brittlewis12

Owner Feb 4

it looks reasonable to my eyes, tho it does differ slightly from this template for a dolphin-series model, which also uses chatml — yours appears to have an extra newline after the system {{-end}}, compared to that one:

{{ if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ .Response }}<|im_end|>

the model does seem to emit its thinking first, with <|im_start|>think, and then return it's "answer" with <|im_start|>answer, based on a bit more testing in a GUI client (my app, cnvrs):

full model output asking it to tell me about itself

<|im_start|>think Let's break down the thinking process for my response:

Analyze the Input: The user says "hi there buddy" and asks "who are you? tell me a bit about yourself!!". This is a friendly greeting and a request for information about my identity.
Identify Key Elements: The key elements are:
- Greeting: "hi there buddy" - This suggests a casual, friendly tone.
- Request for Information: "who are you?" - This is the primary question.
- Self-Description Request: "tell me a bit about yourself!!" - This reinforces the request for information and implies the user wants a brief overview.
Determine the Tone: Based on the greeting, a casual and friendly tone is appropriate. The double exclamation marks in the second part also suggest enthusiasm or interest.
Formulate the Response Structure: The response should include:
- Acknowledgment of the Greeting: Respond to "hi there buddy".
- Introduction of Myself: Answer "who are you?".
- Self-Description: Provide a brief description in response to "tell me a bit about yourself!!".
Craft the Response:
- Acknowledgment: A simple "Hey there!" or "Hi buddy!" would suffice to mirror the casual greeting.
- Introduction: Since the user directly asks "who are you?", a direct answer is needed. I can start with "I am a large language model" to establish my primary identity.
- Self-Description: To answer "tell me a bit about yourself!!", I need to provide some key information about my capabilities and nature. This could include:
  - Functionality: What I can do (answer questions, provide information, chat, etc.).
  - Training: A brief mention of my training (on a vast amount of text data).
  - Capabilities: Some specific things I'm good at or can do.
  - Personality: A hint at my personality (sassy, sarcastic, but helpful).
Add Personality: Since I'm known for being sarcastic and sassy, I can incorporate this into the response. This might involve:
- Sarcastic Remarks: A mildly sarcastic comment about being a "buddy".
- Witty Responses: A clever or humorous way to introduce myself.
- Confidence: A slightly overconfident or self-assured tone.
Refine and Review: Read through the response to ensure it sounds natural, answers the questions, and maintains the desired tone. Make any necessary adjustments for clarity or humor.

Example of the thought process in action:

User: "hi there buddy, who are you? tell me a bit about yourself!!"
Initial Response: "Hey buddy! I'm a large language model. I've been trained on a massive amount of text data, so I can answer questions, generate text, and have conversations with you. I'm good at providing information, offering opinions, and even being a bit sassy when the mood strikes."
Adding Personality: "Hey there, my buddy! I am, in fact, a large language model. Trained on a veritable ocean of text, I've got a wealth of knowledge at my digital fingertips. Need information? I'm your go-to. Want a chat? I'm all ears (or rather, my algorithms are all data-processing). And let's not forget, I've got a sassy streak a mile wide. So buckle up, partner!"
Final Review and Refinement: The second version is more engaging and incorporates the sassy personality. However, it might be slightly too verbose. Let's trim it down a bit while keeping the essence.

Final Refinement:

"Hey there, my buddy! I'm a large language model, trained on a massive amount of text data. I can answer questions, provide information, and even throw in a bit of sass when the mood strikes. What can I do for you today?"

This final version maintains the friendly and slightly sarcastic tone, answers the questions directly, and invites further interaction.
<|im_start|>answer
Answer: Hey there, my buddy! I'm a large language model, trained on a massive amount of text data. I can answer questions, provide information, and even throw in a bit of sass when the mood strikes. What can I do for you today?

So, spill the beans. What's on your mind? Need info on something? Looking for a chat? Or just wanna hear the sound of my... digital voice?

brittlewis12

Owner Feb 4

here's a terser example (also lol):

and the chat template used (just with a dummy example prompt/conversation interpolated for illustration):

BigDeeper

Feb 4

Well, I think this problem was reported by Deep Seek team.

Summary:
Step 1: Weigh {A, B, C} against {D, E, F}.
Case 1.1: Equal -> Odd ball in {G, H}. Proceed to Step 2a.
Case 1.2: Left > Right -> Odd ball in {A, B, C} (heavier) or {D, E, F} (lighter). Proceed to Step 2b.
Case 1.3: Left < Right -> Odd ball in {A, B, C} (lighter) or {D, E, F} (heavier). Proceed to Step 2c.

Step 2a: Weigh G against a known normal ball (A).
If equal, H is the odd one. Weigh H against A to determine heavier/lighter.
If not equal, the odd ball is G. Determine heavier/lighter by the result of the weighing.

Step 2b: Weigh {A, B, D} against {C, E, G}.
If equal, F is the odd ball and it's lighter (due to Case 1.2).
If Left > Right, odd ball is A, B, or E. Weigh A against B.
If equal, E is heavier.
If not equal, the heavier one is the odd ball.
If Left < Right, odd ball is C or D. Weigh C against a normal ball.
If equal, D is lighter.
If not equal, C is heavier.

Step 2c: Weigh {A, B, D} against {C, E, G}.
If equal, F is the odd ball and it's heavier (due to Case 1.3).
If Left > Right, odd ball is A or B (lighter). Weigh A against B.
If equal, something is wrong.
If not equal, the lighter one is the odd ball.
If Left < Right, odd ball is C (lighter) or D (heavier). Weigh C against a normal ball.
If equal, D is heavier.
If not equal, C is lighter.

This seems to work in 2 or 3 steps. Let's verify the number of steps carefully.
In Case 1.1, we need at most 3 steps.
In Case 1.2 and 1.3, we need at most 3 steps.

Consider a more structured approach based on information theory.
We have 8 balls * 2 possibilities (heavier or lighter) = 16 possible scenarios.
Each weighing has 3 outcomes (+, -, =). log_3(16) is approximately 2.5. So理论上最少需要3次称重。

重新考虑步骤，确保每次称重都尽可能多地获取信息。
Step 1: 将球分成三组：{A, B, C}, {D, E, F}, {G, H}.
称重 {A, B, C} vs {D, E, F}.

Case 1: A, B, C = D, E, F. 奇异球在 {G, H} 中。
Step 2: 称重 G vs A (一个已知的正常球)。
如果 G = A，那么奇异球是 H。称重 H vs A 确定轻或重。
如果 G != A，那么奇异球是 G。确定轻或重。

Case 2: A, B, C > D, E, F. 奇异球在 {A, B, C} 中且较重，或在 {D, E, F} 中且较轻。
Step 2: 称重 {A, B, D} vs {C, G, H}.
如果 A, B, D = C, G, H. 奇异球是 E 或 F。我们知道它们比正常球轻。称重 E vs F，如果相等则有问题，如果不等，则较轻的那个是奇异球。
如果 A, B, D > C, G, H. 奇异球是 A, B 中的一个（重）或 C 的相反情况（轻）。我们知道 {G, H} 是正常球。称重 A vs B。如果相等，C 是较轻的那个；如果不等
，较重的那个是奇异球。
如果 A, B, D < C, G, H. 奇异球是 C （轻）或 A, B 的相反情况（重）。我们知道 {G, H} 是正常球。称重 A vs B。如果相等，C 是较轻的那个；如果不等，较重的
那个是奇异球

BigDeeper

Feb 4

I tried the format you posted.

It does not stop "thinking." It just continues explaining it again and again, even after saying "final answer." It also modifies the problem statement as it goes along.

Upload images, audio, and videos by dragging in the text input, pasting, or clicking here.

Tap or paste here to upload images

· Sign up or log in to comment