<think> tag not closed during tool use

#4
by bjodah - opened

I encountered a bug in which llama.cpp crashes (reported here).

I think the Jinja template shipped with your model might be missing the closing </think> tag during tool calls.
If I use bartowski's quant, I get no crash and the tool call generated looks like this:
"content":"<think>\n\n</think>\n\n<tool_call>\n{\"name\": \"run_python_script\", \"arguments\": {\"source\": \"from datetime import datetime\\nstart = datetime(1999, 12, 24)\\ndelta = datetime(2025, 5, 19) - start\\nprint(delta.days)\", \"args\": []}}\n</tool_call>"

When I use unsloth's quant, it looks like this:
"<think>\n\n<tool_call>\n{\"name\": \"run_python_script\", \"arguments\": {\"source\": \"#!/usr/bin/env python\\nfrom datetime import datetime\\n\\n# Parse the dates\\ndate1 = datetime.strptime('1999-12-24', '%Y-%m-%d')\\ndate2 = datetime.strptime('2025-05-19', '%Y-%m-%d')\\n\\n# Calculate the difference in days\\ndelta = date2 - date1\\n\\n# Return the number of days\\nprint(delta.days)\\n\", \"args\": []}}\n</tool_call>"
(I got that from gdb while debugging the llama.cpp crash; see the linked GitHub issue for all details.)
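For anyone who wants to check their own outputs for the same symptom, here is a minimal sketch in Python. The helper name `has_unclosed_think` is hypothetical (it is not part of llama.cpp or any template), and the two sample strings are shortened stand-ins for the outputs quoted above, with the tool-call JSON replaced by `{}`.

```python
def has_unclosed_think(content: str) -> bool:
    """Return True if `content` has more <think> openers than </think> closers.

    Hypothetical helper for illustration only; it just counts tags and does
    not try to parse the tool call itself.
    """
    return content.count("<think>") > content.count("</think>")


# Shortened stand-ins for the two outputs quoted above (assumed shapes).
bartowski_style = "<think>\n\n</think>\n\n<tool_call>\n{}\n</tool_call>"
unsloth_style = "<think>\n\n<tool_call>\n{}\n</tool_call>"

for label, text in [("bartowski", bartowski_style), ("unsloth", unsloth_style)]:
    print(label, "unclosed <think>:", has_unclosed_think(text))
```

This only flags the missing closing tag; it says nothing about why llama.cpp's parser crashes on it.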

If you want to reproduce the failure you can use this script:
https://github.com/bjodah/bug-reproducer-llamacpp-partial-parse/blob/5a84945164f1aa71dee4f1d151ab9959a3812313/run.sh#L20
Here's the diff I used for bartowski's quant:

diff --git a/run.sh b/run.sh
index 7e086bb..06d031b 100755
--- a/run.sh
+++ b/run.sh
@@ -17,7 +17,7 @@ URL_BASE=http://localhost:$PORT
 gdb -ex r -args \
     llama-server \
     --port $PORT \
-    --hf-repo unsloth/Qwen3-4B-GGUF:Q8_0 \
+    --hf-repo bartowski/Qwen_Qwen3-1.7B-GGUF:Q8_0 \
     --n-gpu-layers 999 \
     --jinja \
     --cache-type-k q8_0 \

(bartowski's 4B quant didn't call a tool with this particular seed and setup, so I tried the 1.7B one instead.)
