fix: add missing <think> tag to assistant messages

#81 by jybsuper

Fix chat template to prepend <think> when assistant messages contain </think> but not <think>

When add_generation_prompt=True, the template appends <|im_start|>assistant\n<think>\n as the generation prompt. This means the model will not output the opening <think> tag itself, only the thinking content followed by </think>.

However, when the chat template is applied to an existing conversation, assistant messages that were generated this way are missing their opening <think> tag, making the re-rendered history inconsistent with the template.

This fix adds logic to detect when an assistant message contains </think> but not <think>, and automatically prepends <think>\n to maintain consistency.
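
The template change itself is Jinja, but the added check is equivalent to the following Python sketch (normalize_assistant_content is a hypothetical helper for illustration, not part of the actual template):

def normalize_assistant_content(message: dict) -> dict:
    # If an assistant turn closes a thinking block without opening one,
    # prepend the missing <think> tag so re-rendered history stays
    # consistent with what add_generation_prompt=True produces.
    content = message.get("content", "")
    if message.get("role") == "assistant" and "</think>" in content and "<think>" not in content:
        message = {**message, "content": "<think>\n" + content}
    return message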

Example:

Given the following messages:

[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "What is 2+2?"}
]
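
With a transformers tokenizer that ships this chat template, the prompt can be rendered like so (a sketch; the checkpoint name is a placeholder):

from transformers import AutoTokenizer

# Placeholder repo id; substitute the model this template ships with.
tok = AutoTokenizer.from_pretrained("org/model-with-this-template")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is 2+2?"},
]

# add_generation_prompt=True appends "<|im_start|>assistant\n<think>\n"
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)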

The prompt would be:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>

The model would respond with something like:

The user wants to do arithmetic calculation
</think>
The answer is 4.

The full conversation history would then be:

[
  {"role": "system", "content": "You are a helpful assistant."},
  {"role": "user", "content": "What is 2+2?"},
  {"role": "assistant", "content": "The user wants to do arithmetic calculation\n</think>\nThe answer is 4."}
]
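
Re-rendering this history with the same sketch setup as above makes the inconsistency visible:

history = messages + [
    {
        "role": "assistant",
        "content": "The user wants to do arithmetic calculation\n</think>\nThe answer is 4.",
    },
]

rendered = tok.apply_chat_template(history, tokenize=False)
print(rendered)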

Before fix:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
The user wants to do arithmetic calculation
</think>
The answer is 4.<|im_end|>

After this fix:

<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
What is 2+2?<|im_end|>
<|im_start|>assistant
<think>
The user wants to do arithmetic calculation
</think>
The answer is 4.<|im_end|>

This fix is required for correct loss masking in multi-turn SFT and RL: if the re-rendered history does not match, token for token, what the model actually generated, masks that separate assistant tokens from prompt tokens end up misaligned.
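
To make the masking point concrete: a common recipe builds the assistant-token mask by rendering the prefix (everything up to and including the generation prompt) and the full conversation, then masking out the prefix. Continuing the sketch above (and assuming the template keeps reasoning content in history), this only lines up once the fix is in place:

# Everything before the assistant turn, ending in
# "<|im_start|>assistant\n<think>\n" via the generation prompt.
prefix = tok.apply_chat_template(history[:-1], tokenize=False, add_generation_prompt=True)
full = tok.apply_chat_template(history, tokenize=False)

# With the fix, the full render extends the prefix, so the assistant
# tokens are exactly the suffix and the loss mask is a simple split.
# Without the fix, the missing <think> breaks this prefix property.
assert full.startswith(prefix)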
