Possible issue with the new tokenizer config chat template

#1
by OldKingMeister - opened

The chat_template validation is incompatible with vLLM-style inference (add_generation_prompt=True with a single user message).

Hi, thanks for the great work on this model!

I’m reporting a potential issue with the current chat_template logic in the tokenizer config. Specifically:

The template contains the following hard validation:

{% if messages|length != 2 or messages[0].role != 'user' or messages[1].role != 'assistant' %}
  {{ raise_exception('This template only accepts exactly one user message followed by one assistant message') }}
{% endif %}

This prevents generation-time usage where only a single user message is passed and add_generation_prompt=True is set. This is a common pattern in vLLM, OpenAI-compatible API servers, or any apply_chat_template(messages, add_generation_prompt=True) scenario.
For example, a call like the following will fail:

messages = [{"role": "user", "content": "What is 2 + 2?"}]
tokenizer.apply_chat_template(messages, add_generation_prompt=True)
# Raises: 'This template only accepts exactly one user message followed by one assistant message'
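
For completeness, here is a self-contained reproduction sketch. The repo id below is a placeholder for this model's actual Hub id, and the error is whatever transformers surfaces from the template's raise_exception call:

from transformers import AutoTokenizer

# Placeholder repo id; substitute this model's actual Hub id.
tokenizer = AutoTokenizer.from_pretrained("<this-model-repo-id>")

messages = [{"role": "user", "content": "What is 2 + 2?"}]
try:
    prompt = tokenizer.apply_chat_template(
        messages,
        add_generation_prompt=True,
        tokenize=False,
    )
except Exception as err:  # the template's raise_exception ends up here
    print(f"chat template rejected the input: {err}")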

Sorry, I can't open a PR for this since the tokenizer config is managed by git LFS, but I would suggest the following change:

{% if messages|length == 1 and messages[0].role == "user" %}
  {% set user_content = messages[0].content %}
  {% set assistant_content = None %}
{% elif messages|length == 2 and messages[0].role == "user" and messages[1].role == "assistant" %}
  {% set user_content = messages[0].content %}
  {% set assistant_content = messages[1].content %}
{% else %}
  {{ raise_exception("This template expects one user message (optionally followed by assistant reply).") }}
{% endif %}
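
As a quick sanity check, the relaxed logic can be exercised outside transformers. This is a minimal sketch, not the full chat template: the rendering part at the end is a toy stand-in, and raise_exception is registered as a Jinja global in the same spirit as transformers does it.

from jinja2 import Environment

# Illustrative fragment only; the real chat template also emits special tokens.
template_src = """
{%- if messages|length == 1 and messages[0].role == "user" -%}
  {%- set user_content = messages[0].content -%}
  {%- set assistant_content = None -%}
{%- elif messages|length == 2 and messages[0].role == "user" and messages[1].role == "assistant" -%}
  {%- set user_content = messages[0].content -%}
  {%- set assistant_content = messages[1].content -%}
{%- else -%}
  {{ raise_exception("This template expects one user message (optionally followed by assistant reply).") }}
{%- endif -%}
User: {{ user_content }}
{%- if assistant_content %}
Assistant: {{ assistant_content }}
{%- endif -%}
"""

def raise_exception(message):
    # Mirrors the helper transformers exposes to chat templates.
    raise ValueError(message)

env = Environment()
env.globals["raise_exception"] = raise_exception
template = env.from_string(template_src)

# Generation-time: a single user message no longer raises.
print(template.render(messages=[{"role": "user", "content": "What is 2 + 2?"}]))

# Training-time: a user/assistant pair is still accepted.
print(template.render(messages=[
    {"role": "user", "content": "What is 2 + 2?"},
    {"role": "assistant", "content": "4"},
]))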

Would you consider relaxing the validation so that add_generation_prompt=True works with just one user message? This would make the model more compatible with downstream serving tools like vLLM, FastChat, and OpenAI API wrappers.
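
For context, the downstream call that currently breaks looks roughly like this. The sketch assumes a vLLM OpenAI-compatible server is already running on localhost:8000 and serving this model under a placeholder name:

from openai import OpenAI

# Assumes a local vLLM server, e.g. `vllm serve <this-model-repo-id>` (placeholder id).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

# The server applies the chat template with add_generation_prompt=True,
# so the template must accept a lone user message.
response = client.chat.completions.create(
    model="<this-model-repo-id>",
    messages=[{"role": "user", "content": "What is 2 + 2?"}],
)
print(response.choices[0].message.content)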

Thanks again!

PS: I came here because this model is used in a unit test for verl. Since the model was changed, every test case that uses it now fails, including the CI on my own PR, so fixing the model would be really helpful for me and many other contributors.

I sincerely apologize. I will restore the original version and fix the model.

Fixed now.

Thank you for the timely response. I posted this issue in https://github.com/volcengine/verl/issues/2283. Do you think I can close it directly?

PS: If you need to modify this model further and use it for your own projects, I think you can @ Haibin in this issue and ask him to host an RM model on the verl Hugging Face account. They recently changed the code in another CI job to host a dataset on their Hugging Face account, so I think it's completely reasonable to ask them to host a model for this test case.

Yes, the test model will be added to an official HF repo later.

Cool, I'll close this discussion right now and close the GitHub repo issue right after the test model is hosted on the official HF repo.
Thank you for your patience and contribution.

OldKingMeister changed discussion status to closed
