Tool calling error
500: Value is not callable: null at row 62, column 114:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 62, column 21:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 61, column 55:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
at row 61, column 17:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
at row 60, column 48:
{%- set handled_keys = ['type', 'description', 'enum', 'required'] %}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 60, column 13:
{%- set handled_keys = ['type', 'description', 'enum', 'required'] %}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 49, column 80:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items %}
^
{{- '\n<parameter>' }}
at row 49, column 9:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items %}
^
{{- '\n<parameter>' }}
at row 42, column 29:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 42, column 5:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 39, column 51:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 39, column 1:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 1, column 69:
{#- Copyright 2025-present the Unsloth team. All rights reserved. #}
^
{#- Licensed under the Apache License, Version 2.0 (the "License") #}
Could you provide an example where this is failing? That would be very helpful, thank you!
Hello. I am trying to use the model with my MCP server via Open WebUI. llama.cpp outputs this error on tool calls:
main: server is listening on http://0.0.0.0:8089 - starting the main loop
srv update_slots: all slots are idle
srv log_server_r: request: GET /v1/models 192.168.4.78 200
got exception: {"code":500,"message":"Value is not callable: null at row 62, column 114:\n {%- if json_key not in handled_keys %}\n {%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}\n ^\n {%- if param_fields[json_key] is mapping %}\n at row 62, column 21:\n {%- if json_key not in handled_keys %}\n {%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}\n ^\n {%- if param_fields[json_key] is mapping %}\n at row 61, column 55:\n {%- for json_key in param_fields %}\n {%- if json_key not in handled_keys %}\n ^\n {%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}\n at row 61, column 17:\n {%- for json_key in param_fields %}\n {%- if json_key not in handled_keys %}\n ^\n {%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}\n at row 60, column 48:\n {%- set handled_keys = ['type', 'description', 'enum', 'required'] %}\n {%- for json_key in param_fields %}\n ^\n {%- if json_key not in handled_keys %}\n at row 60, column 13:\n {%- set handled_keys = ['type', 'description', 'enum', 'required'] %}\n {%- for json_key in param_fields %}\n ^\n {%- if json_key not in handled_keys %}\n at row 49, column 80:\n {{- '\n<parameters>' }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n ^\n {{- '\n<parameter>' }}\n at row 49, column 9:\n {{- '\n<parameters>' }}\n {%- for param_name, param_fields in tool.parameters.properties|items %}\n ^\n {{- '\n<parameter>' }}\n at row 42, column 29:\n {{- "<tools>" }}\n {%- for tool in tools %}\n ^\n {%- if tool.function is defined %}\n at row 42, column 5:\n {{- "<tools>" }}\n {%- for tool in tools %}\n ^\n {%- if tool.function is defined %}\n at row 39, column 51:\n{%- endif %}\n{%- if tools is iterable and tools | length > 0 %}\n ^\n {{- "\n\nYou have access to the following functions:\n\n" }}\n at row 39, column 1:\n{%- endif %}\n{%- if tools is iterable and tools | length > 0 %}\n^\n {{- "\n\nYou have access to the following functions:\n\n" }}\n at row 1, column 69:\n{#- Copyright 2025-present the Unsloth team. All rights reserved. #}\n ^\n{#- Licensed under the Apache License, Version 2.0 (the "License") #}\n","type":"server_error"}
srv log_server_r: request: POST /v1/chat/completions 192.168.4.78 500
Is there some other debugging information that would be useful?
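One thing that may help narrow this down: a minimal `tools` request that reproduces the 500 without Open WebUI or the MCP server in the loop. A sketch in Python (the `get_weather` tool below is a made-up example, not from my setup):

```python
import json

# Hypothetical minimal repro: one made-up tool whose parameter schema uses
# only the standard keys (type/description/required). If this payload alone
# returns the same 500 from /v1/chat/completions, the failure is in the
# chat-template rendering, not in Open WebUI or the MCP server.
payload = {
    "model": "default",
    "max_tokens": 64,
    "stream": False,
    "messages": [{"role": "user", "content": "What's the weather in Paris?"}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up current weather for a city.",
                "parameters": {
                    "type": "object",
                    "required": ["city"],
                    "properties": {
                        "city": {"type": "string", "description": "City name"}
                    },
                },
            },
        }
    ],
}

# Write the request body to a file for use with curl.
with open("payload.json", "w") as f:
    json.dump(payload, f, indent=2)
```

Posting the saved file, e.g. `curl -s http://localhost:8089/v1/chat/completions -H 'Content-Type: application/json' -d @payload.json`, should show whether the template alone triggers the error.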
I'm getting the same error with UD-Q5_K_XL and llama.cpp (using `--jinja` and the other flags from the docs), and I'm using Qwen CLI:
✕ [API Error: OpenAI API error: 500 Value is not callable: null at row 62,
column 114:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") |
replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 62, column 21:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") |
replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 61, column 55:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") |
replace(" ", "_") | replace("$", "") %}
at row 61, column 17:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") |
replace(" ", "_") | replace("$", "") %}
at row 60, column 48:
{%- set handled_keys = ['type', 'description', 'enum', 'required']
%}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 60, column 13:
{%- set handled_keys = ['type', 'description', 'enum', 'required']
%}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 49, column 80:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items
%}
^
{{- '\n<parameter>' }}
at row 49, column 9:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items
%}
^
{{- '\n<parameter>' }}
at row 42, column 29:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 42, column 5:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 39, column 51:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 39, column 1:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 1, column 69:
{#- Copyright 2025-present the Unsloth team. All rights reserved. #}
^
{#- Licensed under the Apache License, Version 2.0 (the "License") #}
]
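For what it's worth, the construct the traceback points at (rows 60-62) seems to render fine under reference Jinja2, which would suggest the mismatch is on the llama.cpp/minja side (or an older build) rather than invalid template syntax. A minimal sketch of that section, with a hypothetical parameter schema:

```python
from jinja2 import Environment

# Minimal excerpt of the failing template section (the handled_keys loop),
# rendered with reference Jinja2 to check that the construct itself is valid.
template_src = (
    "{%- set handled_keys = ['type', 'description', 'enum', 'required'] -%}"
    "{%- for json_key in param_fields -%}"
    "{%- if json_key not in handled_keys -%}"
    "{{ json_key | replace('-', '_') | replace(' ', '_') | replace('$', '') }},"
    "{%- endif -%}"
    "{%- endfor -%}"
)
tmpl = Environment().from_string(template_src)

# Hypothetical parameter schema with one key beyond the handled ones
# ("x-extra key"), which is what the loop body normalizes.
param_fields = {"type": "string", "description": "demo", "x-extra key": True}
print(tmpl.render(param_fields=param_fields))  # -> x_extra_key,
```

If this works but llama.cpp still throws `Value is not callable: null` on the same lines, that points at the bundled minja implementation and may be worth testing against the latest llama.cpp build.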
Reproducibility steps below, happy to provide any more info or detail!
Use Continue within VS Code
Serve the UD-Q4_K_XL quant with llama.cpp (I used RamaLama to execute it within a container, using Vulkan in this example)
- Logs:
❯ ramalama --debug serve --image quay.io/ramalama/ramalama:latest -c 50000 --temp 0.7 --runtime-args="--top-k 20 --top-p 0.8 --frequency-penalty 1.05" hf://unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf
2025-07-31 12:36:22 - DEBUG - run_cmd: podman inspect quay.io/ramalama/rocm:0.11
2025-07-31 12:36:22 - DEBUG - Working directory: None
2025-07-31 12:36:22 - DEBUG - Ignore stderr: False
2025-07-31 12:36:22 - DEBUG - Ignore all: True
2025-07-31 12:36:22 - DEBUG - Checking if 8080 is available
2025-07-31 12:36:22 - DEBUG - exec_cmd: podman run --rm --label ai.ramalama.model=hf://unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf --label ai.ramalama.engine=podman --label ai.ramalama.runtime=llama.cpp --label ai.ramalama.port=8080 --label ai.ramalama.command=serve --device /dev/dri --device /dev/kfd --device /dev/accel -e HIP_VISIBLE_DEVICES=0 -p 8080:8080 --security-opt=label=disable --cap-drop=all --security-opt=no-new-privileges --pull newer --label ai.ramalama --name ramalama_O3d5RTPz47 --env=HOME=/tmp --init --mount=type=bind,src=/var/home/kush/.local/share/ramalama/store/huggingface/unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf/blobs/sha256-89d766f4653c43105922c15bcb5ceec053990f571e94d8535f9dd7098a15ba4c,destination=/mnt/models/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf,ro quay.io/ramalama/ramalama:latest llama-server --port 8080 --model /mnt/models/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf --no-warmup --jinja --log-colors --alias unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf --ctx-size 50000 --temp 0.7 --cache-reuse 256 --top-k 20 --top-p 0.8 --frequency-penalty 1.05 -v -ngl 999 --threads 16 --host 0.0.0.0
ggml_vulkan: Found 1 Vulkan devices:
ggml_vulkan: 0 = Radeon 8060S Graphics (RADV GFX1151) (radv) | uma: 1 | fp16: 1 | bf16: 0 | warp size: 64 | shared memory: 65536 | int dot: 1 | matrix cores: KHR_coopmat
build: 5985 (3f4fc97f) with cc (GCC) 15.1.1 20250521 (Red Hat 15.1.1-2) for x86_64-redhat-linux
system info: n_threads = 16, n_threads_batch = 16, total_threads = 32
system_info: n_threads = 16 (n_threads_batch = 16) / 32 | CPU : SSE3 = 1 | SSSE3 = 1 | AVX = 1 | AVX2 = 1 | F16C = 1 | FMA = 1 | BMI2 = 1 | LLAMAFILE = 1 | OPENMP = 1 | REPACK = 1 |
main: binding port with default address family
main: HTTP server is listening, hostname: 0.0.0.0, port: 8080, http threads: 31
main: loading model
srv load_model: loading model '/mnt/models/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf'
llama_model_load_from_file_impl: using device Vulkan0 (Radeon 8060S Graphics (RADV GFX1151)) - 64997 MiB free
llama_model_loader: loaded meta data with 42 key-value pairs and 579 tensors from /mnt/models/Qwen3-Coder-30B-A3B-Instruct-UD-Q4_K_XL.gguf (version GGUF V3 (latest))
llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
llama_model_loader: - kv 0: general.architecture str = qwen3moe
llama_model_loader: - kv 1: general.type str = model
llama_model_loader: - kv 2: general.name str = Qwen3-Coder-30B-A3B-Instruct
llama_model_loader: - kv 3: general.finetune str = Instruct
llama_model_loader: - kv 4: general.basename str = Qwen3-Coder-30B-A3B-Instruct
llama_model_loader: - kv 5: general.quantized_by str = Unsloth
llama_model_loader: - kv 6: general.size_label str = 30B-A3B
llama_model_loader: - kv 7: general.license str = apache-2.0
llama_model_loader: - kv 8: general.license.link str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 9: general.repo_url str = https://huggingface.co/unsloth
llama_model_loader: - kv 10: general.base_model.count u32 = 1
llama_model_loader: - kv 11: general.base_model.0.name str = Qwen3 Coder 30B A3B Instruct
llama_model_loader: - kv 12: general.base_model.0.organization str = Qwen
llama_model_loader: - kv 13: general.base_model.0.repo_url str = https://huggingface.co/Qwen/Qwen3-Cod...
llama_model_loader: - kv 14: general.tags arr[str,2] = ["unsloth", "text-generation"]
llama_model_loader: - kv 15: qwen3moe.block_count u32 = 48
llama_model_loader: - kv 16: qwen3moe.context_length u32 = 262144
llama_model_loader: - kv 17: qwen3moe.embedding_length u32 = 2048
llama_model_loader: - kv 18: qwen3moe.feed_forward_length u32 = 5472
llama_model_loader: - kv 19: qwen3moe.attention.head_count u32 = 32
llama_model_loader: - kv 20: qwen3moe.attention.head_count_kv u32 = 4
llama_model_loader: - kv 21: qwen3moe.rope.freq_base f32 = 10000000.000000
llama_model_loader: - kv 22: qwen3moe.attention.layer_norm_rms_epsilon f32 = 0.000001
llama_model_loader: - kv 23: qwen3moe.expert_used_count u32 = 8
llama_model_loader: - kv 24: qwen3moe.attention.key_length u32 = 128
llama_model_loader: - kv 25: qwen3moe.attention.value_length u32 = 128
llama_model_loader: - kv 26: qwen3moe.expert_count u32 = 128
llama_model_loader: - kv 27: qwen3moe.expert_feed_forward_length u32 = 768
llama_model_loader: - kv 28: qwen3moe.expert_shared_feed_forward_length u32 = 0
llama_model_loader: - kv 29: tokenizer.ggml.model str = gpt2
llama_model_loader: - kv 30: tokenizer.ggml.pre str = qwen2
llama_model_loader: - kv 31: tokenizer.ggml.tokens arr[str,151936] = ["!", "\"", "#", "$", "%", "&", "'", ...
llama_model_loader: - kv 32: tokenizer.ggml.token_type arr[i32,151936] = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, ...
llama_model_loader: - kv 33: tokenizer.ggml.merges arr[str,151387] = ["Ġ Ġ", "ĠĠ ĠĠ", "i n", "Ġ t",...
llama_model_loader: - kv 34: tokenizer.ggml.eos_token_id u32 = 151645
llama_model_loader: - kv 35: tokenizer.ggml.padding_token_id u32 = 151654
llama_model_loader: - kv 36: tokenizer.ggml.add_bos_token bool = false
llama_model_loader: - kv 37: tokenizer.chat_template str = {#- Copyright 2025-present the Unslot...
llama_model_loader: - kv 38: general.quantization_version u32 = 2
llama_model_loader: - kv 39: general.file_type u32 = 15
llama_model_loader: - kv 40: quantize.imatrix.file str = Qwen3-Coder-30B-A3B-Instruct-GGUF/ima...
llama_model_loader: - kv 41: quantize.imatrix.entries_count u32 = 383
llama_model_loader: - type f32: 241 tensors
llama_model_loader: - type q4_K: 292 tensors
llama_model_loader: - type q5_K: 35 tensors
llama_model_loader: - type q6_K: 11 tensors
print_info: file format = GGUF V3 (latest)
print_info: file type = Q4_K - Medium
print_info: file size = 16.45 GiB (4.63 BPW)
init_tokenizer: initializing tokenizer for type 2
load: control token: 151660 '<|fim_middle|>' is not marked as EOG
load: control token: 151659 '<|fim_prefix|>' is not marked as EOG
load: control token: 151653 '<|vision_end|>' is not marked as EOG
load: control token: 151648 '<|box_start|>' is not marked as EOG
load: control token: 151646 '<|object_ref_start|>' is not marked as EOG
load: control token: 151649 '<|box_end|>' is not marked as EOG
load: control token: 151655 '<|image_pad|>' is not marked as EOG
load: control token: 151651 '<|quad_end|>' is not marked as EOG
load: control token: 151647 '<|object_ref_end|>' is not marked as EOG
load: control token: 151652 '<|vision_start|>' is not marked as EOG
load: control token: 151654 '<|vision_pad|>' is not marked as EOG
load: control token: 151656 '<|video_pad|>' is not marked as EOG
load: control token: 151644 '<|im_start|>' is not marked as EOG
load: control token: 151661 '<|fim_suffix|>' is not marked as EOG
load: control token: 151650 '<|quad_start|>' is not marked as EOG
load: special tokens cache size = 26
load: token to piece cache size = 0.9311 MB
print_info: arch = qwen3moe
print_info: vocab_only = 0
print_info: n_ctx_train = 262144
print_info: n_embd = 2048
print_info: n_layer = 48
print_info: n_head = 32
print_info: n_head_kv = 4
print_info: n_rot = 128
print_info: n_swa = 0
print_info: is_swa_any = 0
print_info: n_embd_head_k = 128
print_info: n_embd_head_v = 128
print_info: n_gqa = 8
print_info: n_embd_k_gqa = 512
print_info: n_embd_v_gqa = 512
print_info: f_norm_eps = 0.0e+00
print_info: f_norm_rms_eps = 1.0e-06
print_info: f_clamp_kqv = 0.0e+00
print_info: f_max_alibi_bias = 0.0e+00
print_info: f_logit_scale = 0.0e+00
print_info: f_attn_scale = 0.0e+00
print_info: n_ff = 5472
print_info: n_expert = 128
print_info: n_expert_used = 8
print_info: causal attn = 1
print_info: pooling type = 0
print_info: rope type = 2
print_info: rope scaling = linear
print_info: freq_base_train = 10000000.0
print_info: freq_scale_train = 1
print_info: n_ctx_orig_yarn = 262144
print_info: rope_finetuned = unknown
print_info: model type = 30B.A3B
print_info: model params = 30.53 B
print_info: general.name = Qwen3-Coder-30B-A3B-Instruct
print_info: n_ff_exp = 768
print_info: vocab type = BPE
print_info: n_vocab = 151936
print_info: n_merges = 151387
print_info: BOS token = 11 ','
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151654 '<|vision_pad|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = true)
...
load_tensors: offloaded 49/49 layers to GPU
load_tensors: Vulkan0 model buffer size = 16674.36 MiB
load_tensors: CPU_Mapped model buffer size = 166.92 MiB
....................................................................................................
llama_context: constructing llama_context
llama_context: non-unified KV cache requires ggml_set_rows() - forcing unified KV cache
llama_context: n_seq_max = 1
llama_context: n_ctx = 50000
llama_context: n_ctx_per_seq = 50000
llama_context: n_batch = 2048
llama_context: n_ubatch = 512
llama_context: causal_attn = 1
llama_context: flash_attn = 0
llama_context: kv_unified = true
llama_context: freq_base = 10000000.0
llama_context: freq_scale = 1
llama_context: n_ctx_per_seq (50000) < n_ctx_train (262144) -- the full capacity of the model will not be utilized
set_abort_callback: call
llama_context: Vulkan_Host output buffer size = 0.58 MiB
create_memory: n_ctx = 50016 (padded)
llama_kv_cache_unified: layer 0: dev = Vulkan0
...
llama_kv_cache_unified: Vulkan0 KV buffer size = 4689.00 MiB
llama_kv_cache_unified: size = 4689.00 MiB ( 50016 cells, 48 layers, 1/ 1 seqs), K (f16): 2344.50 MiB, V (f16): 2344.50 MiB
llama_kv_cache_unified: LLAMA_SET_ROWS=0, using old ggml_cpy() method for backwards compatibility
llama_context: enumerating backends
llama_context: backend_ptrs.size() = 2
llama_context: max_nodes = 4632
llama_context: worst-case: n_tokens = 512, n_seqs = 1, n_outputs = 0
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
graph_reserve: reserving a graph for ubatch with n_tokens = 1, n_seqs = 1, n_outputs = 1
graph_reserve: reserving a graph for ubatch with n_tokens = 512, n_seqs = 1, n_outputs = 512
llama_context: Vulkan0 compute buffer size = 3247.69 MiB
llama_context: Vulkan_Host compute buffer size = 101.70 MiB
llama_context: graph nodes = 3270
llama_context: graph splits = 2
clear_adapter_lora: call
common_init_from_params: added <|endoftext|> logit bias = -inf
common_init_from_params: added <|im_end|> logit bias = -inf
common_init_from_params: added <|fim_pad|> logit bias = -inf
common_init_from_params: added <|repo_name|> logit bias = -inf
common_init_from_params: added <|file_sep|> logit bias = -inf
common_init_from_params: setting dry_penalty_last_n to ctx_size = 50016
srv init: initializing slots, n_slots = 1
slot init: id 0 | task -1 | new slot n_ctx_slot = 50016
slot reset: id 0 | task -1 |
main: model loaded
main: chat template, chat_template: {#- Copyright 2025-present the Unsloth team. All rights reserved. #}
{#- Licensed under the Apache License, Version 2.0 (the "License") #}
{#- Edits made by Unsloth to fix the chat template #}
{% macro render_item_list(item_list, tag_name='required') %}
{%- if item_list is defined and item_list is iterable and item_list | length > 0 %}
{%- if tag_name %}{{- '\n<' ~ tag_name ~ '>' -}}{% endif %}
{{- '[' }}
{%- for item in item_list -%}
{%- if loop.index > 1 %}{{- ", "}}{% endif -%}
{%- if item is string -%}
{{ "`" ~ item ~ "`" }}
{%- else -%}
{{ item }}
{%- endif -%}
{%- endfor -%}
{{- ']' }}
{%- if tag_name %}{{- '</' ~ tag_name ~ '>' -}}{% endif %}
{%- endif %}
{% endmacro %}
{%- if messages[0]["role"] == "system" %}
{%- set system_message = messages[0]["content"] %}
{%- set loop_messages = messages[1:] %}
{%- else %}
{%- set loop_messages = messages %}
{%- endif %}
{%- if not tools is defined %}
{%- set tools = [] %}
{%- endif %}
{%- if system_message is defined %}
{{- "<|im_start|>system\n" + system_message }}
{%- else %}
{%- if tools is iterable and tools | length > 0 %}
{{- "<|im_start|>system\nYou are Qwen, a helpful AI assistant that can interact with a computer to solve tasks." }}
{%- endif %}
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
{{- "\n\nYou have access to the following functions:\n\n" }}
{{- "<tools>" }}
{%- for tool in tools %}
{%- if tool.function is defined %}
{%- set tool = tool.function %}
{%- endif %}
{{- "\n<function>\n<name>" ~ tool.name ~ "</name>" }}
{{- '\n<description>' ~ (tool.description | trim) ~ '</description>' }}
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items %}
{{- '\n<parameter>' }}
{{- '\n<name>' ~ param_name ~ '</name>' }}
{%- if param_fields.type is defined %}
{{- '\n<type>' ~ (param_fields.type | string) ~ '</type>' }}
{%- endif %}
{%- if param_fields.description is defined %}
{{- '\n<description>' ~ (param_fields.description | trim) ~ '</description>' }}
{%- endif %}
{{- render_item_list(param_fields.enum, 'enum') }}
{%- set handled_keys = ['type', 'description', 'enum', 'required'] %}
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
{%- if param_fields[json_key] is mapping %}
{{- '\n<' ~ normed_json_key ~ '>' ~ (param_fields[json_key] | tojson | safe) ~ '</' ~ normed_json_key ~ '>' }}
{%- else %}
{{- '\n<' ~ normed_json_key ~ '>' ~ (param_fields[json_key] | string) ~ '</' ~ normed_json_key ~ '>' }}
{%- endif %}
{%- endif %}
{%- endfor %}
{{- render_item_list(param_fields.required, 'required') }}
{{- '\n</parameter>' }}
{%- endfor %}
{{- render_item_list(tool.parameters.required, 'required') }}
{{- '\n</parameters>' }}
{%- if tool.return is defined %}
{%- if tool.return is mapping %}
{{- '\n<return>' ~ (tool.return | tojson | safe) ~ '</return>' }}
{%- else %}
{{- '\n<return>' ~ (tool.return | string) ~ '</return>' }}
{%- endif %}
{%- endif %}
{{- '\n</function>' }}
{%- endfor %}
{{- "\n</tools>" }}
{{- '\n\nIf you choose to call a function ONLY reply in the following format with NO suffix:\n\n<tool_call>\n<function=example_function_name>\n<parameter=example_parameter_1>\nvalue_1\n</parameter>\n<parameter=example_parameter_2>\nThis is the value for the second parameter\nthat can span\nmultiple lines\n</parameter>\n</function>\n</tool_call>\n\n<IMPORTANT>\nReminder:\n- Function calls MUST follow the specified format: an inner <function=...></function> block must be nested within <tool_call></tool_call> XML tags\n- Required parameters MUST be specified\n- You may provide optional reasoning for your function call in natural language BEFORE the function call, but NOT after\n- If there is no function call available, answer the question like normal with your current knowledge and do not tell the user about function calls\n</IMPORTANT>' }}
{%- endif %}
{%- if system_message is defined %}
{{- '<|im_end|>\n' }}
{%- else %}
{%- if tools is iterable and tools | length > 0 %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- endif %}
{%- for message in loop_messages %}
{%- if message.role == "assistant" and message.tool_calls is defined and message.tool_calls is iterable and message.tool_calls | length > 0 %}
{{- '<|im_start|>' + message.role }}
{%- if message.content is defined and message.content is string and message.content | trim | length > 0 %}
{{- '\n' + message.content | trim + '\n' }}
{%- endif %}
{%- for tool_call in message.tool_calls %}
{%- if tool_call.function is defined %}
{%- set tool_call = tool_call.function %}
{%- endif %}
{{- '\n<tool_call>\n<function=' + tool_call.name + '>\n' }}
{%- if tool_call.arguments is defined %}
{%- for args_name, args_value in tool_call.arguments|items %}
{{- '<parameter=' + args_name + '>\n' }}
{%- set args_value = args_value if args_value is string else args_value | string %}
{{- args_value }}
{{- '\n</parameter>\n' }}
{%- endfor %}
{%- endif %}
{{- '</function>\n</tool_call>' }}
{%- endfor %}
{{- '<|im_end|>\n' }}
{%- elif message.role == "user" or message.role == "system" or message.role == "assistant" %}
{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>' + '\n' }}
{%- elif message.role == "tool" %}
{%- if loop.previtem and loop.previtem.role != "tool" %}
{{- '<|im_start|>user\n' }}
{%- endif %}
{{- '<tool_response>\n' }}
{{- message.content }}
{{- '\n</tool_response>\n' }}
{%- if not loop.last and loop.nextitem.role != "tool" %}
{{- '<|im_end|>\n' }}
{%- elif loop.last %}
{{- '<|im_end|>\n' }}
{%- endif %}
{%- else %}
{{- '<|im_start|>' + message.role + '\n' + message.content + '<|im_end|>\n' }}
{%- endif %}
{%- endfor %}
{%- if add_generation_prompt %}
{{- '<|im_start|>assistant\n' }}
{%- endif %}
{#- Copyright 2025-present the Unsloth team. All rights reserved. #}
{#- Licensed under the Apache License, Version 2.0 (the "License") #}, example_format: '<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
Hello<|im_end|>
<|im_start|>assistant
Hi there<|im_end|>
<|im_start|>user
How are you?<|im_end|>
<|im_start|>assistant
'
main: server is listening on http://0.0.0.0:8080 - starting the main loop
que start_loop: processing new tasks
que start_loop: update slots
srv update_slots: all slots are idle
srv kv_cache_cle: clearing KV cache
que start_loop: waiting for new tasks
- Try to have the agent mode in Continue use the built-in tool calling:
- Logs:
request: {
"messages": [
{
"role": "system",
"content": "<important_rules>\n You are in agent mode.\n\n Always include the language and file name in the info string when you write code blocks.\n If you are editing \"src/main.py\" for example, your code block should start with '```python src/main.py'\n\n</important_rules>"
},
{
"role": "user",
"content": "Use the web search tool to look up some fun facts"
}
],
"model": "default",
"max_tokens": 4096,
"stream": true,
"tools": [
{
"type": "function",
"function": {
"name": "read_file",
"description": "Use this tool if you need to view the contents of an existing file.",
"parameters": {
"type": "object",
"required": [
"filepath"
],
"properties": {
"filepath": {
"type": "string",
"description": "The path of the file to read, relative to the root of the workspace (NOT uri or absolute path)"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "create_new_file",
"description": "Create a new file. Only use this when a file doesn't exist and should be created",
"parameters": {
"type": "object",
"required": [
"filepath",
"contents"
],
"properties": {
"filepath": {
"type": "string",
"description": "The path where the new file should be created, relative to the root of the workspace"
},
"contents": {
"type": "string",
"description": "The contents to write to the new file"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "run_terminal_command",
"description": "Run a terminal command in the current directory.\nThe shell is not stateful and will not remember any previous commands. When a command is run in the background ALWAYS suggest using shell commands to stop it; NEVER suggest using Ctrl+C. When suggesting subsequent shell commands ALWAYS format them in shell command blocks. Do NOT perform actions requiring special/admin privileges. Choose terminal commands and scripts optimized for darwin and arm64 and shell /bin/zsh.",
"parameters": {
"type": "object",
"required": [
"command"
],
"properties": {
"command": {
"type": "string",
"description": "The command to run. This will be passed directly into the IDE shell."
},
"waitForCompletion": {
"type": "boolean",
"description": "Whether to wait for the command to complete before returning. Default is true. Set to false to run the command in the background. Set to true to run the command in the foreground and wait to collect the output."
}
}
}
}
},
{
"type": "function",
"function": {
"name": "file_glob_search",
"description": "Search for files recursively in the project using glob patterns. Supports ** for recursive directory search. Output may be truncated; use targeted patterns",
"parameters": {
"type": "object",
"required": [
"pattern"
],
"properties": {
"pattern": {
"type": "string",
"description": "Glob pattern for file path matching"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "search_web",
"description": "Performs a web search, returning top results. Use this tool sparingly - only for questions that require specialized, external, and/or up-to-date knowledege. Common programming questions do not require web search.",
"parameters": {
"type": "object",
"required": [
"query"
],
"properties": {
"query": {
"type": "string",
"description": "The natural language search query"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "view_diff",
"description": "View the current diff of working changes",
"parameters": {
"type": "object",
"properties": {}
}
}
},
{
"type": "function",
"function": {
"name": "read_currently_open_file",
"description": "Read the currently open file in the IDE. If the user seems to be referring to a file that you can't see, try using this",
"parameters": {
"type": "object",
"properties": {}
}
}
},
{
"type": "function",
"function": {
"name": "ls",
"description": "List files and folders in a given directory",
"parameters": {
"type": "object",
"properties": {
"dirPath": {
"type": "string",
"description": "The directory path relative to the root of the project. Use forward slash paths like '/'. rather than e.g. '.'"
},
"recursive": {
"type": "boolean",
"description": "If true, lists files and folders recursively. To prevent unexpected large results, use this sparingly"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "create_rule_block",
"description": "Creates a \"rule\" that can be referenced in future conversations. This should be used whenever you want to establish code standards / preferences that should be applied consistently, or when you want to avoid making a mistake again. To modify existing rules, use the edit tool instead.\n\nRule Types:\n- Always: Include only \"rule\" (always included in model context)\n- Auto Attached: Include \"rule\", \"globs\", and/or \"regex\" (included when files match patterns)\n- Agent Requested: Include \"rule\" and \"description\" (AI decides when to apply based on description)\n- Manual: Include only \"rule\" (only included when explicitly mentioned using @ruleName)",
"parameters": {
"type": "object",
"required": [
"name",
"rule"
],
"properties": {
"name": {
"type": "string",
"description": "Short, descriptive name summarizing the rule's purpose (e.g. 'React Standards', 'Type Hints')"
},
"rule": {
"type": "string",
"description": "Clear, imperative instruction for future code generation (e.g. 'Use named exports', 'Add Python type hints'). Each rule should focus on one specific standard."
},
"description": {
"type": "string",
"description": "Description of when this rule should be applied. Required for Agent Requested rules (AI decides when to apply). Optional for other types."
},
"globs": {
"type": "string",
"description": "Optional file patterns to which this rule applies (e.g. ['**/*.{ts,tsx}'] or ['src/**/*.ts', 'tests/**/*.ts'])"
},
"regex": {
"type": "string",
"description": "Optional regex patterns to match against file content. Rule applies only to files whose content matches the pattern (e.g. 'useEffect' for React hooks or '\\bclass\\b' for class definitions)"
},
"alwaysApply": {
"type": "boolean",
"description": "Whether this rule should always be applied. Set to false for Agent Requested and Manual rules. Omit or set to true for Always and Auto Attached rules."
}
}
}
}
},
{
"type": "function",
"function": {
"name": "fetch_url_content",
"description": "Can be used to view the contents of a website using a URL. Do NOT use this for files.",
"parameters": {
"type": "object",
"required": [
"url"
],
"properties": {
"url": {
"type": "string",
"description": "The URL to read"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "grep_search",
"description": "Perform a search over the repository using ripgrep. Output may be truncated, so use targeted queries",
"parameters": {
"type": "object",
"required": [
"query"
],
"properties": {
"query": {
"type": "string",
"description": "The search query to use. Must be a valid ripgrep regex expression, escaped where needed"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "request_rule",
"description": "Use this tool to retrieve additional 'rules' that contain more context/instructions based on their descriptions. Available rules:\nNo rules available.",
"parameters": {
"type": "object",
"required": [
"name"
],
"properties": {
"name": {
"type": "string",
"description": "Name of the rule"
}
}
}
}
},
{
"type": "function",
"function": {
"name": "edit_existing_file",
"description": "Use this tool to edit an existing file. If you don't know the contents of the file, read it first.\n When addressing code modification requests, present a concise code snippet that\n emphasizes only the necessary changes and uses abbreviated placeholders for\n unmodified sections. For example:\n\n ```language /path/to/file\n // ... existing code ...\n\n {{ modified code here }}\n\n // ... existing code ...\n\n {{ another modification }}\n\n // ... rest of code ...\n ```\n\n In existing files, you should always restate the function or class that the snippet belongs to:\n\n ```language /path/to/file\n // ... existing code ...\n\n function exampleFunction() {\n // ... existing code ...\n\n {{ modified code here }}\n\n // ... rest of function ...\n }\n\n // ... rest of code ...\n ```\n\n Since users have access to their complete file, they prefer reading only the\n relevant modifications. It's perfectly acceptable to omit unmodified portions\n at the beginning, middle, or end of files using these \"lazy\" comments. Only\n provide the complete file when explicitly requested. Include a concise explanation\n of changes unless the user specifically asks for code only.\n\nNote this tool CANNOT be called in parallel.",
"parameters": {
"type": "object",
"required": [
"filepath",
"changes"
],
"properties": {
"filepath": {
"type": "string",
"description": "The path of the file to edit, relative to the root of the workspace."
},
"changes": {
"type": "string",
"description": "Any modifications to the file, showing only needed changes. Do NOT wrap this in a codeblock or write anything besides the code changes. In larger files, use brief language-appropriate placeholders for large unmodified sections, e.g. '// ... existing code ...'"
}
}
}
}
}
],
"parallel_tool_calls": false
}
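The JSON above is the tail of an OpenAI-compatible `/v1/chat/completions` request body that llama-server accepts, with a `tools` array and `parallel_tool_calls` disabled. A minimal sketch of constructing such a request (the endpoint URL is an assumption — llama-server defaults to port 8080, but yours may differ — and the tool list is trimmed to one entry):

```python
import json
import urllib.request

# Hypothetical endpoint; adjust host/port to your llama-server instance.
URL = "http://localhost:8080/v1/chat/completions"

# One tool in the same shape as the definitions logged above, trimmed down.
read_file_tool = {
    "type": "function",
    "function": {
        "name": "read_file",
        "parameters": {
            "type": "object",
            "required": ["filepath"],
            "properties": {
                "filepath": {
                    "type": "string",
                    "description": "Path of the file, relative to the workspace root",
                }
            },
        },
    },
}

body = {
    "model": "default",
    "messages": [{"role": "user", "content": "Open README.md"}],
    "tools": [read_file_tool],
    "parallel_tool_calls": False,  # matches the request captured in this log
    "stream": True,
}

# Sending this would stream back the "data: {...}" SSE chunks seen later in the log.
req = urllib.request.Request(
    URL,
    data=json.dumps(body).encode(),
    headers={"Content-Type": "application/json"},
)
```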
srv params_from_: Grammar: any-tool-call ::= ( read-file-call | create-new-file-call | run-terminal-command-call | file-glob-search-call | search-web-call | view-diff-call | read-currently-open-file-call | ls-call | create-rule-block-call | fetch-url-content-call | grep-search-call | request-rule-call | edit-existing-file-call ) space
boolean ::= ("true" | "false") space
char ::= [^"\\\x7F\x00-\x1F] | [\\] (["\\bfnrt] | "u" [0-9a-fA-F]{4})
create-new-file-args ::= "{" space create-new-file-args-filepath-kv "," space create-new-file-args-contents-kv "}" space
create-new-file-args-contents-kv ::= "\"contents\"" space ":" space string
create-new-file-args-filepath-kv ::= "\"filepath\"" space ":" space string
create-new-file-call ::= "{" space create-new-file-call-name-kv "," space create-new-file-call-arguments-kv "}" space
create-new-file-call-arguments ::= "{" space create-new-file-call-arguments-filepath-kv "," space create-new-file-call-arguments-contents-kv "}" space
create-new-file-call-arguments-contents-kv ::= "\"contents\"" space ":" space string
create-new-file-call-arguments-filepath-kv ::= "\"filepath\"" space ":" space string
create-new-file-call-arguments-kv ::= "\"arguments\"" space ":" space create-new-file-call-arguments
create-new-file-call-name ::= "\"create_new_file\"" space
create-new-file-call-name-kv ::= "\"name\"" space ":" space create-new-file-call-name
create-new-file-function-tag ::= "<function" ( "=create_new_file" | " name=\"create_new_file\"" ) ">" space create-new-file-args "</function>" space
create-rule-block-args ::= "{" space create-rule-block-args-name-kv "," space create-rule-block-args-rule-kv ( "," space ( create-rule-block-args-description-kv create-rule-block-args-description-rest | create-rule-block-args-globs-kv create-rule-block-args-globs-rest | create-rule-block-args-regex-kv create-rule-block-args-regex-rest | create-rule-block-args-alwaysApply-kv ) )? "}" space
create-rule-block-args-alwaysApply-kv ::= "\"alwaysApply\"" space ":" space boolean
create-rule-block-args-description-kv ::= "\"description\"" space ":" space string
create-rule-block-args-description-rest ::= ( "," space create-rule-block-args-globs-kv )? create-rule-block-args-globs-rest
create-rule-block-args-globs-kv ::= "\"globs\"" space ":" space string
create-rule-block-args-globs-rest ::= ( "," space create-rule-block-args-regex-kv )? create-rule-block-args-regex-rest
create-rule-block-args-name-kv ::= "\"name\"" space ":" space string
create-rule-block-args-regex-kv ::= "\"regex\"" space ":" space string
create-rule-block-args-regex-rest ::= ( "," space create-rule-block-args-alwaysApply-kv )?
create-rule-block-args-rule-kv ::= "\"rule\"" space ":" space string
create-rule-block-call ::= "{" space create-rule-block-call-name-kv "," space create-rule-block-call-arguments-kv "}" space
create-rule-block-call-arguments ::= "{" space create-rule-block-call-arguments-name-kv "," space create-rule-block-call-arguments-rule-kv ( "," space ( create-rule-block-call-arguments-description-kv create-rule-block-call-arguments-description-rest | create-rule-block-call-arguments-globs-kv create-rule-block-call-arguments-globs-rest | create-rule-block-call-arguments-regex-kv create-rule-block-call-arguments-regex-rest | create-rule-block-call-arguments-alwaysApply-kv ) )? "}" space
create-rule-block-call-arguments-alwaysApply-kv ::= "\"alwaysApply\"" space ":" space boolean
create-rule-block-call-arguments-description-kv ::= "\"description\"" space ":" space string
create-rule-block-call-arguments-description-rest ::= ( "," space create-rule-block-call-arguments-globs-kv )? create-rule-block-call-arguments-globs-rest
create-rule-block-call-arguments-globs-kv ::= "\"globs\"" space ":" space string
create-rule-block-call-arguments-globs-rest ::= ( "," space create-rule-block-call-arguments-regex-kv )? create-rule-block-call-arguments-regex-rest
create-rule-block-call-arguments-kv ::= "\"arguments\"" space ":" space create-rule-block-call-arguments
create-rule-block-call-arguments-name-kv ::= "\"name\"" space ":" space string
create-rule-block-call-arguments-regex-kv ::= "\"regex\"" space ":" space string
create-rule-block-call-arguments-regex-rest ::= ( "," space create-rule-block-call-arguments-alwaysApply-kv )?
create-rule-block-call-arguments-rule-kv ::= "\"rule\"" space ":" space string
create-rule-block-call-name ::= "\"create_rule_block\"" space
create-rule-block-call-name-kv ::= "\"name\"" space ":" space create-rule-block-call-name
create-rule-block-function-tag ::= "<function" ( "=create_rule_block" | " name=\"create_rule_block\"" ) ">" space create-rule-block-args "</function>" space
edit-existing-file-args ::= "{" space edit-existing-file-args-filepath-kv "," space edit-existing-file-args-changes-kv "}" space
edit-existing-file-args-changes-kv ::= "\"changes\"" space ":" space string
edit-existing-file-args-filepath-kv ::= "\"filepath\"" space ":" space string
edit-existing-file-call ::= "{" space edit-existing-file-call-name-kv "," space edit-existing-file-call-arguments-kv "}" space
edit-existing-file-call-arguments ::= "{" space edit-existing-file-call-arguments-filepath-kv "," space edit-existing-file-call-arguments-changes-kv "}" space
edit-existing-file-call-arguments-changes-kv ::= "\"changes\"" space ":" space string
edit-existing-file-call-arguments-filepath-kv ::= "\"filepath\"" space ":" space string
edit-existing-file-call-arguments-kv ::= "\"arguments\"" space ":" space edit-existing-file-call-arguments
edit-existing-file-call-name ::= "\"edit_existing_file\"" space
edit-existing-file-call-name-kv ::= "\"name\"" space ":" space edit-existing-file-call-name
edit-existing-file-function-tag ::= "<function" ( "=edit_existing_file" | " name=\"edit_existing_file\"" ) ">" space edit-existing-file-args "</function>" space
fetch-url-content-args ::= "{" space fetch-url-content-args-url-kv "}" space
fetch-url-content-args-url-kv ::= "\"url\"" space ":" space string
fetch-url-content-call ::= "{" space fetch-url-content-call-name-kv "," space fetch-url-content-call-arguments-kv "}" space
fetch-url-content-call-arguments ::= "{" space fetch-url-content-call-arguments-url-kv "}" space
fetch-url-content-call-arguments-kv ::= "\"arguments\"" space ":" space fetch-url-content-call-arguments
fetch-url-content-call-arguments-url-kv ::= "\"url\"" space ":" space string
fetch-url-content-call-name ::= "\"fetch_url_content\"" space
fetch-url-content-call-name-kv ::= "\"name\"" space ":" space fetch-url-content-call-name
fetch-url-content-function-tag ::= "<function" ( "=fetch_url_content" | " name=\"fetch_url_content\"" ) ">" space fetch-url-content-args "</function>" space
file-glob-search-args ::= "{" space file-glob-search-args-pattern-kv "}" space
file-glob-search-args-pattern-kv ::= "\"pattern\"" space ":" space string
file-glob-search-call ::= "{" space file-glob-search-call-name-kv "," space file-glob-search-call-arguments-kv "}" space
file-glob-search-call-arguments ::= "{" space file-glob-search-call-arguments-pattern-kv "}" space
file-glob-search-call-arguments-kv ::= "\"arguments\"" space ":" space file-glob-search-call-arguments
file-glob-search-call-arguments-pattern-kv ::= "\"pattern\"" space ":" space string
file-glob-search-call-name ::= "\"file_glob_search\"" space
file-glob-search-call-name-kv ::= "\"name\"" space ":" space file-glob-search-call-name
file-glob-search-function-tag ::= "<function" ( "=file_glob_search" | " name=\"file_glob_search\"" ) ">" space file-glob-search-args "</function>" space
grep-search-args ::= "{" space grep-search-args-query-kv "}" space
grep-search-args-query-kv ::= "\"query\"" space ":" space string
grep-search-call ::= "{" space grep-search-call-name-kv "," space grep-search-call-arguments-kv "}" space
grep-search-call-arguments ::= "{" space grep-search-call-arguments-query-kv "}" space
grep-search-call-arguments-kv ::= "\"arguments\"" space ":" space grep-search-call-arguments
grep-search-call-arguments-query-kv ::= "\"query\"" space ":" space string
grep-search-call-name ::= "\"grep_search\"" space
grep-search-call-name-kv ::= "\"name\"" space ":" space grep-search-call-name
grep-search-function-tag ::= "<function" ( "=grep_search" | " name=\"grep_search\"" ) ">" space grep-search-args "</function>" space
ls-args ::= "{" space (ls-args-dirPath-kv ls-args-dirPath-rest | ls-args-recursive-kv )? "}" space
ls-args-dirPath-kv ::= "\"dirPath\"" space ":" space string
ls-args-dirPath-rest ::= ( "," space ls-args-recursive-kv )?
ls-args-recursive-kv ::= "\"recursive\"" space ":" space boolean
ls-call ::= "{" space ls-call-name-kv "," space ls-call-arguments-kv "}" space
ls-call-arguments ::= "{" space (ls-call-arguments-dirPath-kv ls-call-arguments-dirPath-rest | ls-call-arguments-recursive-kv )? "}" space
ls-call-arguments-dirPath-kv ::= "\"dirPath\"" space ":" space string
ls-call-arguments-dirPath-rest ::= ( "," space ls-call-arguments-recursive-kv )?
ls-call-arguments-kv ::= "\"arguments\"" space ":" space ls-call-arguments
ls-call-arguments-recursive-kv ::= "\"recursive\"" space ":" space boolean
ls-call-name ::= "\"ls\"" space
ls-call-name-kv ::= "\"name\"" space ":" space ls-call-name
ls-function-tag ::= "<function" ( "=ls" | " name=\"ls\"" ) ">" space ls-args "</function>" space
read-currently-open-file-args ::= "{" space "}" space
read-currently-open-file-call ::= "{" space read-currently-open-file-call-name-kv "," space read-currently-open-file-call-arguments-kv "}" space
read-currently-open-file-call-arguments ::= "{" space "}" space
read-currently-open-file-call-arguments-kv ::= "\"arguments\"" space ":" space read-currently-open-file-call-arguments
read-currently-open-file-call-name ::= "\"read_currently_open_file\"" space
read-currently-open-file-call-name-kv ::= "\"name\"" space ":" space read-currently-open-file-call-name
read-currently-open-file-function-tag ::= "<function" ( "=read_currently_open_file" | " name=\"read_currently_open_file\"" ) ">" space read-currently-open-file-args "</function>" space
read-file-args ::= "{" space read-file-args-filepath-kv "}" space
read-file-args-filepath-kv ::= "\"filepath\"" space ":" space string
read-file-call ::= "{" space read-file-call-name-kv "," space read-file-call-arguments-kv "}" space
read-file-call-arguments ::= "{" space read-file-call-arguments-filepath-kv "}" space
read-file-call-arguments-filepath-kv ::= "\"filepath\"" space ":" space string
read-file-call-arguments-kv ::= "\"arguments\"" space ":" space read-file-call-arguments
read-file-call-name ::= "\"read_file\"" space
read-file-call-name-kv ::= "\"name\"" space ":" space read-file-call-name
read-file-function-tag ::= "<function" ( "=read_file" | " name=\"read_file\"" ) ">" space read-file-args "</function>" space
request-rule-args ::= "{" space request-rule-args-name-kv "}" space
request-rule-args-name-kv ::= "\"name\"" space ":" space string
request-rule-call ::= "{" space request-rule-call-name-kv "," space request-rule-call-arguments-kv "}" space
request-rule-call-arguments ::= "{" space request-rule-call-arguments-name-kv "}" space
request-rule-call-arguments-kv ::= "\"arguments\"" space ":" space request-rule-call-arguments
request-rule-call-arguments-name-kv ::= "\"name\"" space ":" space string
request-rule-call-name ::= "\"request_rule\"" space
request-rule-call-name-kv ::= "\"name\"" space ":" space request-rule-call-name
request-rule-function-tag ::= "<function" ( "=request_rule" | " name=\"request_rule\"" ) ">" space request-rule-args "</function>" space
root ::= tool-call
run-terminal-command-args ::= "{" space run-terminal-command-args-command-kv ( "," space ( run-terminal-command-args-waitForCompletion-kv ) )? "}" space
run-terminal-command-args-command-kv ::= "\"command\"" space ":" space string
run-terminal-command-args-waitForCompletion-kv ::= "\"waitForCompletion\"" space ":" space boolean
run-terminal-command-call ::= "{" space run-terminal-command-call-name-kv "," space run-terminal-command-call-arguments-kv "}" space
run-terminal-command-call-arguments ::= "{" space run-terminal-command-call-arguments-command-kv ( "," space ( run-terminal-command-call-arguments-waitForCompletion-kv ) )? "}" space
run-terminal-command-call-arguments-command-kv ::= "\"command\"" space ":" space string
run-terminal-command-call-arguments-kv ::= "\"arguments\"" space ":" space run-terminal-command-call-arguments
run-terminal-command-call-arguments-waitForCompletion-kv ::= "\"waitForCompletion\"" space ":" space boolean
run-terminal-command-call-name ::= "\"run_terminal_command\"" space
run-terminal-command-call-name-kv ::= "\"name\"" space ":" space run-terminal-command-call-name
run-terminal-command-function-tag ::= "<function" ( "=run_terminal_command" | " name=\"run_terminal_command\"" ) ">" space run-terminal-command-args "</function>" space
search-web-args ::= "{" space search-web-args-query-kv "}" space
search-web-args-query-kv ::= "\"query\"" space ":" space string
search-web-call ::= "{" space search-web-call-name-kv "," space search-web-call-arguments-kv "}" space
search-web-call-arguments ::= "{" space search-web-call-arguments-query-kv "}" space
search-web-call-arguments-kv ::= "\"arguments\"" space ":" space search-web-call-arguments
search-web-call-arguments-query-kv ::= "\"query\"" space ":" space string
search-web-call-name ::= "\"search_web\"" space
search-web-call-name-kv ::= "\"name\"" space ":" space search-web-call-name
search-web-function-tag ::= "<function" ( "=search_web" | " name=\"search_web\"" ) ">" space search-web-args "</function>" space
space ::= | " " | "\n"{1,2} [ \t]{0,20}
string ::= "\"" char* "\"" space
tool-call ::= read-file-function-tag | create-new-file-function-tag | run-terminal-command-function-tag | file-glob-search-function-tag | search-web-function-tag | view-diff-function-tag | read-currently-open-file-function-tag | ls-function-tag | create-rule-block-function-tag | fetch-url-content-function-tag | grep-search-function-tag | request-rule-function-tag | edit-existing-file-function-tag | wrappable-tool-call | ( "```\n" | "```json\n" | "```xml\n" ) space wrappable-tool-call space "```" space
view-diff-args ::= "{" space "}" space
view-diff-call ::= "{" space view-diff-call-name-kv "," space view-diff-call-arguments-kv "}" space
view-diff-call-arguments ::= "{" space "}" space
view-diff-call-arguments-kv ::= "\"arguments\"" space ":" space view-diff-call-arguments
view-diff-call-name ::= "\"view_diff\"" space
view-diff-call-name-kv ::= "\"name\"" space ":" space view-diff-call-name
view-diff-function-tag ::= "<function" ( "=view_diff" | " name=\"view_diff\"" ) ">" space view-diff-args "</function>" space
wrappable-tool-call ::= ( any-tool-call | "<tool_call>" space any-tool-call "</tool_call>" | "<function_call>" space any-tool-call "</function_call>" | "<response>" space any-tool-call "</response>" | "<tools>" space any-tool-call "</tools>" | "<json>" space any-tool-call "</json>" | "<xml>" space any-tool-call "</xml>" | "<JSON>" space any-tool-call "</JSON>" ) space
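The GBNF rules above constrain any tool call the model emits to well-formed JSON with the `"name"` key before `"arguments"`. A quick sketch (plain Python, no grammar engine) of a payload that the `read-file-call` rule would accept, checking both the JSON content and the key order that the grammar pins down:

```python
import json
import re

# A completion satisfying read-file-call: "name" first, then "arguments".
payload = '{ "name": "read_file", "arguments": { "filepath": "src/main.py" } }'

parsed = json.loads(payload)
assert parsed["name"] == "read_file"
assert parsed["arguments"] == {"filepath": "src/main.py"}

# The grammar also fixes the key order, which json.loads alone does not verify:
assert re.match(r'\{\s*"name"\s*:\s*"read_file"\s*,\s*"arguments"\s*:', payload)
```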
srv params_from_: Grammar lazy: true
srv params_from_: Chat format: Hermes 2 Pro
srv params_from_: Preserved token: 151667
srv params_from_: Preserved token: 151668
srv params_from_: Preserved token: 151657
srv params_from_: Preserved token: 151658
srv params_from_: Not preserved because more than 1 token: <function
srv params_from_: Not preserved because more than 1 token: <tools>
srv params_from_: Not preserved because more than 1 token: </tools>
srv params_from_: Not preserved because more than 1 token: <response>
srv params_from_: Not preserved because more than 1 token: </response>
srv params_from_: Not preserved because more than 1 token: <function_call>
srv params_from_: Not preserved because more than 1 token: </function_call>
srv params_from_: Not preserved because more than 1 token: <json>
srv params_from_: Not preserved because more than 1 token: </json>
srv params_from_: Not preserved because more than 1 token: <JSON>
srv params_from_: Not preserved because more than 1 token: </JSON>
srv params_from_: Preserved token: 73594
srv params_from_: Not preserved because more than 1 token: ```json
srv params_from_: Not preserved because more than 1 token: ```xml
srv params_from_: Grammar trigger word: `<function=read_file>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"read_file"`
srv params_from_: Grammar trigger word: `<function=create_new_file>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"create_new_file"`
srv params_from_: Grammar trigger word: `<function=run_terminal_command>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"run_terminal_command"`
srv params_from_: Grammar trigger word: `<function=file_glob_search>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"file_glob_search"`
srv params_from_: Grammar trigger word: `<function=search_web>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"search_web"`
srv params_from_: Grammar trigger word: `<function=view_diff>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"view_diff"`
srv params_from_: Grammar trigger word: `<function=read_currently_open_file>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"read_currently_open_file"`
srv params_from_: Grammar trigger word: `<function=ls>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"ls"`
srv params_from_: Grammar trigger word: `<function=create_rule_block>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"create_rule_block"`
srv params_from_: Grammar trigger word: `<function=fetch_url_content>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"fetch_url_content"`
srv params_from_: Grammar trigger word: `<function=grep_search>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"grep_search"`
srv params_from_: Grammar trigger word: `<function=request_rule>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"request_rule"`
srv params_from_: Grammar trigger word: `<function=edit_existing_file>`
srv params_from_: Grammar trigger pattern: `<function\s+name\s*=\s*"edit_existing_file"`
srv params_from_: Grammar trigger pattern full: `(?:<think>[\s\S]*?</think>\s*)?(\s*(?:<tool_call>|<function|(?:```(?:json|xml)?
\s*)?(?:<function_call>|<tools>|<xml><json>|<response>)?\s*\{\s*"name"\s*:\s*"(?:read_file|create_new_file|run_terminal_command|file_glob_search|search_web|view_diff|read_currently_open_file|ls|create_rule_block|fetch_url_content|grep_search|request_rule|edit_existing_file)"))[\s\S]*`
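With `Grammar lazy: true`, sampling runs unconstrained until one of the trigger words or patterns registered above appears in the output; only then does the tool-call grammar take over (hence the later `Grammar still awaiting trigger` lines while the model emits plain prose). A rough sketch of that gating logic — a simplification for illustration, not the actual llama.cpp implementation:

```python
import re

# Subset of the trigger words/patterns registered in the log above.
TRIGGER_WORDS = ["<function=read_file>", "<function=grep_search>"]
TRIGGER_PATTERNS = [re.compile(r'<function\s+name\s*=\s*"read_file"')]

def grammar_triggered(text: str) -> bool:
    """Return True once the generated text contains a trigger,
    i.e. when grammar-constrained sampling should engage."""
    if any(word in text for word in TRIGGER_WORDS):
        return True
    return any(p.search(text) for p in TRIGGER_PATTERNS)

assert not grammar_triggered("I'll search for")           # prose: grammar stays dormant
assert grammar_triggered("<function=read_file>")          # trigger word form
assert grammar_triggered('<function name="read_file">')   # trigger pattern form
```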
srv add_waiting_: add task 0 to waiting list. current waiting = 0 (before add)
que post: new task, id = 0/1, front = 0
que start_loop: processing new tasks
que start_loop: processing task, id = 0
slot get_availabl: id 0 | task -1 | selected slot by lru, t_last = -1
slot reset: id 0 | task -1 |
slot launch_slot_: id 0 | task 0 | launching slot : {"id":0,"id_task":0,"n_ctx":50016,"speculative":false,"is_processing":false,"params":{"n_predict":4096,"seed":4294967295,"temperature":0.699999988079071,...
slot launch_slot_: id 0 | task 0 | processing task
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 1, front = 0
slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 50016, n_keep = 0, n_prompt_tokens = 2340
slot update_slots: id 0 | task 0 | trying to reuse chunks with size > 256, slot.n_past = 0
slot update_slots: id 0 | task 0 | after context reuse, new slot.n_past = 0
slot update_slots: id 0 | task 0 | kv cache rm [0, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 2048, n_tokens = 2048, progress = 0.875214
srv update_slots: decoding batch, n_tokens = 2048
clear_adapter_lora: call
set_embeddings: value = 0
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 1
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 2, front = 0
slot update_slots: id 0 | task 0 | kv cache rm [2048, end)
slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 2340, n_tokens = 292, progress = 1.000000
slot update_slots: id 0 | task 0 | prompt done, n_past = 2340, n_tokens = 292
srv update_slots: decoding batch, n_tokens = 292
clear_adapter_lora: call
set_embeddings: value = 0
Grammar still awaiting trigger after token 40 (`I`)
srv update_chat_: Parsing chat message: I
Parsing input with format Hermes 2 Pro: I
Parsed message: {"role":"assistant","content":"I"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 1, n_remaining = 4095, next token: 40 'I'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 2
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 3, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2341, n_cache_tokens = 2341, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"role":"assistant","content":null}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"I"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 3278 (`'ll`)
srv update_chat_: Parsing chat message: I'll
Parsing input with format Hermes 2 Pro: I'll
Parsed message: {"role":"assistant","content":"I'll"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 2, n_remaining = 4094, next token: 3278 ''ll'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 3
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 4, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2342, n_cache_tokens = 2342, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"'ll"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 2711 (` search`)
srv update_chat_: Parsing chat message: I'll search
Parsing input with format Hermes 2 Pro: I'll search
Parsed message: {"role":"assistant","content":"I'll search"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 3, n_remaining = 4093, next token: 2711 ' search'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 4
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 5, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2343, n_cache_tokens = 2343, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" search"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 369 (` for`)
srv update_chat_: Parsing chat message: I'll search for
Parsing input with format Hermes 2 Pro: I'll search for
Parsed message: {"role":"assistant","content":"I'll search for"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 4, n_remaining = 4092, next token: 369 ' for'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 5
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 6, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2344, n_cache_tokens = 2344, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" for"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 1045 (` some`)
srv update_chat_: Parsing chat message: I'll search for some
Parsing input with format Hermes 2 Pro: I'll search for some
Parsed message: {"role":"assistant","content":"I'll search for some"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 5, n_remaining = 4091, next token: 1045 ' some'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 6
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 7, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2345, n_cache_tokens = 2345, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" some"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 7040 (` interesting`)
srv update_chat_: Parsing chat message: I'll search for some interesting
Parsing input with format Hermes 2 Pro: I'll search for some interesting
Parsed message: {"role":"assistant","content":"I'll search for some interesting"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 6, n_remaining = 4090, next token: 7040 ' interesting'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 7
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 8, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2346, n_cache_tokens = 2346, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" interesting"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 323 (` and`)
srv update_chat_: Parsing chat message: I'll search for some interesting and
Parsing input with format Hermes 2 Pro: I'll search for some interesting and
Parsed message: {"role":"assistant","content":"I'll search for some interesting and"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 7, n_remaining = 4089, next token: 323 ' and'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 8
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 9, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2347, n_cache_tokens = 2347, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" and"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 2464 (` fun`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 8, n_remaining = 4088, next token: 2464 ' fun'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 9
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 10, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2348, n_cache_tokens = 2348, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" fun"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 13064 (` facts`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 9, n_remaining = 4087, next token: 13064 ' facts'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 10
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 11, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2349, n_cache_tokens = 2349, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" facts"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 311 (` to`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 10, n_remaining = 4086, next token: 311 ' to'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 11
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 12, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2350, n_cache_tokens = 2350, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" to"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 4332 (` share`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 11, n_remaining = 4085, next token: 4332 ' share'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 12
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 13, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2351, n_cache_tokens = 2351, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" share"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 448 (` with`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 12, n_remaining = 4084, next token: 448 ' with'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 13
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 14, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2352, n_cache_tokens = 2352, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" with"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 498 (` you`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 13, n_remaining = 4083, next token: 498 ' you'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 14
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 15, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2353, n_cache_tokens = 2353, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":" you"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 624 (`.
`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
Partial parse: (?:(```(?:xml|json)?\n\s*)?(<tool_call>|<function_call>|<tool>|<tools>|<response>|<json>|<xml>|<JSON>)?(\s*\{\s*"name"))|<function=([^>]+)>|<function name="([^"]+)">
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you."}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 14, n_remaining = 4082, next token: 624 '.
'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 15
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 16, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2354, n_cache_tokens = 2354, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"."}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 151657 (`<tool_call>`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
Partial parse: (?:(```(?:xml|json)?\n\s*)?(<tool_call>|<function_call>|<tool>|<tools>|<response>|<json>|<xml>|<JSON>)?(\s*\{\s*"name"))|<function=([^>]+)>|<function name="([^"]+)">
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 15, n_remaining = 4081, next token: 151657 '<tool_call>'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 16
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 17, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2355, n_cache_tokens = 2355, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"\n"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 198 (`
`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
Partial parse: (?:(```(?:xml|json)?\n\s*)?(<tool_call>|<function_call>|<tool>|<tools>|<response>|<json>|<xml>|<JSON>)?(\s*\{\s*"name"))|<function=([^>]+)>|<function name="([^"]+)">
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 16, n_remaining = 4080, next token: 198 '
'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 17
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 18, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2356, n_cache_tokens = 2356, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
Grammar still awaiting trigger after token 27 (`<`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
<
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
<
Partial parse: (?:(```(?:xml|json)?\n\s*)?(<tool_call>|<function_call>|<tool>|<tools>|<response>|<json>|<xml>|<JSON>)?(\s*\{\s*"name"))|<function=([^>]+)>|<function name="([^"]+)">
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n<tool_call>\n"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 17, n_remaining = 4079, next token: 27 '<'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 18
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 19, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2357, n_cache_tokens = 2357, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"<tool_call>\n"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 1688 (`function`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function
Partial parse: (?:(```(?:xml|json)?\n\s*)?(<tool_call>|<function_call>|<tool>|<tools>|<response>|<json>|<xml>|<JSON>)?(\s*\{\s*"name"))|<function=([^>]+)>|<function name="([^"]+)">
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n<tool_call>\n"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 18, n_remaining = 4078, next token: 1688 'function'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 19
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 20, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2358, n_cache_tokens = 2358, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
Grammar still awaiting trigger after token 96598 (`=search`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n<tool_call>\n<function=search"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 19, n_remaining = 4077, next token: 96598 '=search'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 20
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 21, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2359, n_cache_tokens = 2359, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"<function=search"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar still awaiting trigger after token 25960 (`_web`)
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search_web
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search_web
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n<tool_call>\n<function=search_web"}
srv send: sending result for task id = 0
srv send: task id = 0 pushed to result queue
slot process_toke: id 0 | task 0 | n_decoded = 20, n_remaining = 4076, next token: 25960 '_web'
srv update_slots: run slots completed
que start_loop: waiting for new tasks
que start_loop: processing new tasks
que start_loop: processing task, id = 21
que start_loop: update slots
srv update_slots: posting NEXT_RESPONSE
que post: new task, id = 22, front = 0
slot update_slots: id 0 | task 0 | slot decode token, n_ctx = 50016, n_past = 2360, n_cache_tokens = 2360, truncated = 0
srv update_slots: decoding batch, n_tokens = 1
clear_adapter_lora: call
set_embeddings: value = 0
data stream, to_send: data: {"choices":[{"finish_reason":null,"index":0,"delta":{"content":"_web"}}],"created":1753979799,"id":"chatcmpl-syQmS9ST4eaCJelPs8FTaPgktIlDyFdF","model":"default","system_fingerprint":"b5985-3f4fc97f","object":"chat.completion.chunk"}
Grammar triggered on regex: '<function=search_web>
'
srv update_chat_: Parsing chat message: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search_web>
Parsing input with format Hermes 2 Pro: I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search_web>
Failed to parse up to error: [json.exception.parse_error.101] parse error at line 2, column 1: syntax error while parsing value - unexpected end of input; expected '[', '{', or a literal: <<<
>>>
Parsed message: {"role":"assistant","content":"I'll search for some interesting and fun facts to share with you.\n<tool_call>\n\n"}
/lib64/libggml-base.so(+0x2e65) [0x7f642d3f0e65]
/lib64/libggml-base.so(ggml_print_backtrace+0x1ec) [0x7f642d3f122c]
/lib64/libggml-base.so(+0x13119) [0x7f642d401119]
/lib64/libstdc++.so.6(+0x1eadc) [0x7f642d195adc]
/lib64/libstdc++.so.6(_ZSt10unexpectedv+0x0) [0x7f642d17fd3c]
/lib64/libstdc++.so.6(+0x1ed88) [0x7f642d195d88]
llama-server() [0x417101]
llama-server() [0x524bd3]
llama-server() [0x48ef1b]
llama-server() [0x48f6ba]
llama-server() [0x48fe6c]
llama-server() [0x49fbfd]
llama-server() [0x46fb89]
llama-server() [0x431f0c]
/lib64/libc.so.6(+0x35f5) [0x7f642ce6e5f5]
/lib64/libc.so.6(__libc_start_main+0x88) [0x7f642ce6e6a8]
llama-server() [0x433be5]
terminate called after throwing an instance of 'std::runtime_error'
what(): Invalid diff: 'I'll search for some interesting and fun facts to share with you.
<tool_call>
<function=search_web' not found at start of 'I'll search for some interesting and fun facts to share with you.
<tool_call>
Same error
Value is not callable: null at row 62, column 114:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 62, column 21:
{%- if json_key not in handled_keys %}
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
^
{%- if param_fields[json_key] is mapping %}
at row 61, column 55:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
at row 61, column 17:
{%- for json_key in param_fields %}
{%- if json_key not in handled_keys %}
^
{%- set normed_json_key = json_key | replace("-", "_") | replace(" ", "_") | replace("$", "") %}
at row 60, column 48:
{%- set handled_keys = ['type', 'description', 'enum', 'required'] %}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 60, column 13:
{%- set handled_keys = ['type', 'description', 'enum', 'required'] %}
{%- for json_key in param_fields %}
^
{%- if json_key not in handled_keys %}
at row 49, column 80:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items %}
^
{{- '\n<parameter>' }}
at row 49, column 9:
{{- '\n<parameters>' }}
{%- for param_name, param_fields in tool.parameters.properties|items %}
^
{{- '\n<parameter>' }}
at row 42, column 29:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 42, column 5:
{{- "<tools>" }}
{%- for tool in tools %}
^
{%- if tool.function is defined %}
at row 39, column 51:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 39, column 1:
{%- endif %}
{%- if tools is iterable and tools | length > 0 %}
^
{{- "\n\nYou have access to the following functions:\n\n" }}
at row 1, column 69:
{#- Copyright 2025-present the Unsloth team. All rights reserved. #}
^
{#- Licensed under the Apache License, Version 2.0 (the "License") #}
Can confirm that the llama.cpp server crashes when parsing tool calls.
Most relevant llama.cpp server logs:
[...]
Parsing input with format Hermes 2 Pro: I'll help you implement support for the internal RTC in your STM32CubeIDE project. Let me first explore the project structure to understand how the current RTC implementation works and then implement the requested changes.
First, let me check what files exist related to RTC:
<tool_call>
<function=bash>
Failed to parse up to error: [json.exception.parse_error.101] parse error at line 2, column 1: syntax error while parsing value - unexpected end of input; expected '[', '{', or a literal: <<<
>>>
[...]
#1 0x00007e4e0e46ede3 in ggml_print_backtrace () from /home/xxx/llama.cpp/build/bin/libggml-base.so
#2 0x00007e4e0e47f83f in ggml_uncaught_exception() () from /home/xxx/llama.cpp/build/bin/libggml-base.so
#3 0x00007e4e0debb0da in ?? () from /lib/x86_64-linux-gnu/libstdc++.so.6
#4 0x00007e4e0dea5a55 in std::terminate() () from /lib/x86_64-linux-gnu/libstdc++.so.6
#5 0x00007e4e0debb391 in __cxa_throw () from /lib/x86_64-linux-gnu/libstdc++.so.6
#6 0x00005e12c30c9f2e in string_diff(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) [clone .cold] ()
#7 0x00005e12c31e0b6c in common_chat_msg_diff::compute_diffs(common_chat_msg const&, common_chat_msg const&) ()
#8 0x00005e12c313e6eb in server_slot::update_chat_msg(std::vector<common_chat_msg_diff, std::allocator<common_chat_msg_diff> >&) ()
#9 0x00005e12c313ee1f in server_context::send_partial_response(server_slot&, completion_token_output const&) ()
#10 0x00005e12c313f5c5 in server_context::process_token(completion_token_output&, server_slot&) ()
#11 0x00005e12c315978c in server_context::update_slots() ()
#12 0x00005e12c311dbc5 in server_queue::start_loop() ()
#13 0x00005e12c30e4f5e in main ()
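From the backtrace, the throw happens in `string_diff()` while computing streaming deltas. A minimal sketch of what such a prefix-based diff presumably does (the function name comes from the backtrace; the body here is an assumption, not llama.cpp's actual code): the previously streamed message must be a prefix of the newly parsed one, and the diff throws otherwise.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical sketch of the prefix check behind the "Invalid diff"
// error (name taken from the backtrace; body is an assumption).
std::string string_diff_sketch(const std::string &last, const std::string &current) {
    if (current.rfind(last, 0) != 0) {  // 'last' must be a prefix of 'current'
        throw std::runtime_error(
            "Invalid diff: '" + last + "' not found at start of '" + current + "'");
    }
    return current.substr(last.size());  // the newly generated delta
}
```

In the crash above, the last parsed message still contains `<function=search_web` as plain content, while the new one has had it consumed by the grammar trigger, so the prefix check fails and the uncaught exception terminates the server.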
Running llama.cpp server inference with streaming disabled avoids server crashes in my case; however, there are some artifacts between tool calls.
I think the problem is that llama.cpp recognizes the tool call format as a Hermes 2 Pro template:
// common/chat.cpp
// [...]
// Hermes 2/3 Pro, Qwen 2.5 Instruct (w/ tools)
if (src.find("<tool_call>") != std::string::npos && params.json_schema.is_null()) {
return common_chat_params_init_hermes_2_pro(tmpl, params);
}
// [...]
At a quick glance, the "Hermes 2 Pro" format looks different from what the built-in Jinja template emits.
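To illustrate the mismatch: the partial-parse regex in the logs above includes a `<function=([^>]+)>` alternative, so the XML-style output produced by the template still fires the Hermes trigger, even though what follows is not the `{"name": ...}` JSON the Hermes 2 Pro parser then tries to read. A small standalone repro of just that alternative (simplified; this is not llama.cpp's code):

```cpp
#include <regex>
#include <string>

// Returns the captured function name if the "<function=NAME>" trigger
// alternative matches, else an empty string. (Only this one alternative
// of the trigger regex from the logs is reproduced here.)
std::string match_function_trigger(const std::string &output) {
    static const std::regex trigger(R"(<function=([^>]+)>)");
    std::smatch m;
    if (std::regex_search(output, m, trigger)) {
        return m[1];
    }
    return "";
}
```

Feeding it the streamed output from the logs (`...<tool_call>\n<function=search_web>`) yields `search_web`, i.e. the grammar trigger fires, but the Hermes parser's subsequent JSON parse of the (empty) body fails with the `unexpected end of input` error shown above.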
Yeah, I agree that would be inconsistent; you're right. It's just that things aren't that bad on my side... until I reach those 30K+ tokens!
Testing with the cache-type params removed.
Q4 UD Quant
Prompt 1: Success
Prompt 2: Failed (infinite looping; since it's running much slower, I'm not going to wait for the full failure loop)
Q5 UD Quant
Prompt 1: Failed
Prompt 2: Not running it, since the first prompt is a much simpler task.
Q6 UD Quant
Prompt 1: Failed
Prompt 2: Not running it, since the first prompt is a much simpler task.
Sadly, performance is still very poor without the cache-type params set.
--
I'm going to play around with Qwen-Code (hopefully you can use it with the local model) and see if that gives any better results.
Testing in qwen-code... hoping for a better outcome!
Prompt 1:
Not sure if I'm supposed to be seeing that HTML-like and JSON code.
Keeps getting worse...
Seemed like it was maybe going to do it... but eventually gave up and failed.
Darn, I was really hoping qwen-code would at least give me some hope. But if it's not working in qwen-code, what are the odds it's ever going to work outside of it?
Out of curiosity, I had to try the Qwen3 non-thinking model in qwen-code:
Prompt 2:
And...
Perfect!
It's not Rampart at all (it probably referenced the Flappy Bird game I had in the directory), but at least it's a game, it works, and it has an end screen?
It just absolutely destroys Qwen3 Coder at everything related to tool calling.
Someone please hack the code to inject the coding knowledge of Qwen3 Coder into the brain of Qwen3 non-thinking and call it a job well done! :)
At this moment, how do I fix this?
Latest llama.cpp build, latest unsloth quant gguf. Error from first message. =(
Make sure to also use the updated template from https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507/blob/main/chat_template.jinja.
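For llama-server specifically, the downloaded template can be passed in explicitly; a sketch of the invocation (flag names from llama.cpp's server options; the model file name and paths here are placeholders):

```shell
# Fetch the updated chat template, then point llama-server at it
# (model file name below is a placeholder for your local GGUF).
curl -L -o chat_template.jinja \
  "https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507/raw/main/chat_template.jinja"

./llama-server \
  --model ./your-qwen3-coder-quant.gguf \
  --jinja \
  --chat-template-file chat_template.jinja
```
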
Thank you!
Could you provide an example where this is falling - that would be very helpful thank you!
Is there any work being done by the Qwen team or you guys to improve the tool-calling issues? Or is the expectation to download the GGUF from here, get the Jinja template from the Instruct version (https://huggingface.co/unsloth/Qwen3-30B-A3B-Instruct-2507/blob/main/chat_template.jinja), and use Qwen Code only?