LM Studio prompt template for MLX quants.
For those struggling to get the LM Studio MLX-quantized version running, here is the Gemini-rewritten template, which works just fine:
{{- bos_token }}
{%- if not tools is defined %}
{%- set tools = none %}
{%- endif %}
{%- if not enable_thinking is defined %}
{%- set enable_thinking = false %}
{%- endif %}
{%- if messages[0]['role'] == 'system' %}
{%- set system_message = messages[0]['content']|trim %}
{%- set messages = messages[1:] %}
{%- else %}
{%- set system_message = "" %}
{%- endif %}
{% set has_system_content = (system_message != '') or (tools is not none) or enable_thinking %}
{% if has_system_content %}
{{- "<|start_header_id|>system<|end_header_id|> \n\n" }}
{% if enable_thinking %}
{{- "Enable deep thinking subroutine." }}
{% if system_message != '' or tools is not none %}
{{- " \n\n" }}
{% endif %}
{% endif %}
{% if system_message != '' %}
{{- system_message }}
{% if tools is not none %}
{{- " \n\n" }}
{% endif %}
{% endif %}
{% if tools is not none %}
{{- "Available Tools: \n" }}
{% for t in tools %}
{{- t | tojson(indent=4) }}
{{- " \n\n" }}
{% endfor %}
{% endif %}
{{- "<|eot_id|>" }}
{% endif %}
{%- for message in messages %}
{%- if not (message.role == "ipython" or message.role == "tool" or message.role == "tool_results" or (message.tool_calls is defined and message.tool_calls is not none)) %}
{{- '<|start_header_id|>' + message['role'] + '<|end_header_id|> \n\n' }}
{%- if message['content'] is string %}
{{- message['content'] | trim }}
{%- else %}
{%- for item in message['content'] %}
{%- if item.type == 'text' %}
{{- item.text | trim }}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- '<|eot_id|>' }}
{%- elif message.tool_calls is defined and message.tool_calls is not none %}
{{- "<|start_header_id|>assistant<|end_header_id|> \n\n" }}
{%- if message['content'] is string %}
{{- message['content'] | trim }}
{%- else %}
{%- for item in message['content'] %}
{%- if item.type == 'text' %}
{{- item.text | trim }}
{%- if item.text | trim != "" %}
{{- " \n\n" }}
{%- endif %}
{%- endif %}
{%- endfor %}
{%- endif %}
{{- "[" }}
{%- for tool_call in message.tool_calls %}
{%- set out = tool_call.function|tojson %}
{%- if not tool_call.id is defined %}
{{- out }}
{%- else %}
{{- out[:-1] }}
{{- ', "id": "' + tool_call.id + '"}' }}
{%- endif %}
{%- if not loop.last %}
{{- ", " }}
{%- else %}
{{- "]<|eot_id|>" }}
{%- endif %}
{%- endfor %}
{%- elif message.role == "ipython" or message["role"] == "tool_results" or message["role"] == "tool" %}
{{- "<|start_header_id|>ipython<|end_header_id|> \n\n" }}
{%- if message.tool_call_id is defined and message.tool_call_id != '' %}
{{- '{"content": ' + (message.content | tojson) + ', "call_id": "' + message.tool_call_id + '"}' }}
{%- else %}
{{- '{"content": ' + (message.content | tojson) + '}' }}
{%- endif %}
{{- "<|eot_id|>" }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|start_header_id|>assistant<|end_header_id|> \n\n' }}
{%- endif %}
I tried the smaller 14B and 32B (GGUF versions from Bartowski) and they run fine with no template issues. Do you mean issues with enabling CoT? That's easy: you just have to put "Enable deep thinking subroutine." (without quotes) into the system prompt. I think it's pretty neat that you can use that sentence like a switch to turn the thinking process on; otherwise it works like a regular model.
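Since LM Studio also exposes an OpenAI-compatible local server, you can flip that switch from code as well as from the UI's system-prompt field. A minimal standard-library sketch; the port (LM Studio's default 1234) and the placeholder model id are assumptions, adjust to your setup:

```python
import json
import urllib.request

# Assumed defaults: LM Studio's local server on port 1234. The model id is
# a placeholder; the server answers with whichever model is currently loaded.
payload = {
    "model": "local-model",
    "messages": [
        # This system line is the "switch" that turns the thinking process on.
        {"role": "system", "content": "Enable deep thinking subroutine."},
        {"role": "user", "content": "What is 17 * 24?"},
    ],
}
req = urllib.request.Request(
    "http://localhost:1234/v1/chat/completions",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
try:
    with urllib.request.urlopen(req, timeout=5) as resp:
        reply = json.loads(resp.read())["choices"][0]["message"]["content"]
        print(reply)
except OSError as exc:  # server not running or unreachable
    print(f"Could not reach LM Studio server: {exc}")
```

Leaving the system prompt out (or empty) makes the model respond like a regular, non-thinking model.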
In my case, the issue was with the MLX quant. I believe that is the important context. Thanks.
That's easy, you just have to put "Enable deep thinking subroutine." (without quotes) into system prompt. I think it's pretty neat that you can use that sentence like a switch to turn thinking process on, otherwise it works like a regular model.
That's crazy easy. I'm using Bartowski's llama 8B on my poor old GPU, and have been messing with stuff I don't understand for hours. Thank you!!