Question about the chat template ignoring add_generation_prompt
When formatting multi-turn SFT data, I noticed that the tokenizer's chat template seems to ignore the optional parameter `add_generation_prompt` and always appends the prefix for the next assistant turn after formatting all the messages. As a result, if I use the tokenizer's `apply_chat_template` to format a data point, there is always an extra generation prompt, even when the conversation is finished and the last message is the model's final response. This introduces extra tokens into the SFT inputs, which should end with only the EOS token.
As in the example above, this is the end of the SFT data point, yet an extra suffix is still appended.
I think the problem is that the tokenizer's template simply ignores the parameter and treats it as `True`:
Compare this with the Qwen model's template here:
I'm not sure whether my observation is correct, so it would be great if you could take a look, thanks.
This is not a big issue, but it can be a little confusing when using the chat template for data formatting :)