Eval
+-------------+------------+-----------------+---------------+-------+---------+---------+
| Model       | Dataset    | Metric          | Subset        | Num   | Score   | Cat.0   |
+=============+============+=================+===============+=======+=========+=========+
| model       | gpqa       | AveragePass@1   | gpqa_extended | 50    | 0.34    | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | gpqa       | AveragePass@1   | gpqa_main     | 50    | 0.32    | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | gpqa       | AveragePass@1   | gpqa_diamond  | 50    | 0.32    | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | gpqa       | AveragePass@1   | OVERALL       | 150   | 0.3267  | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | gsm8k      | AverageAccuracy | main          | 50    | 0.76    | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Act.EM          | in_domain     | 42    | 0.2619  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Act.EM          | out_of_domain | 47    | 0.3617  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Act.EM          | OVERALL       | 89    | 0.3146  | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Plan.EM         | in_domain     | 0     | 0       | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Plan.EM         | out_of_domain | 0     | 0       | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Plan.EM         | OVERALL       | 0     | 0       | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | F1              | in_domain     | 42    | 0.2095  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | F1              | out_of_domain | 47    | 0.2527  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | F1              | OVERALL       | 89    | 0.2323  | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | HalluRate       | in_domain     | 42    | 0.119   | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | HalluRate       | out_of_domain | 47    | 0.0851  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | HalluRate       | OVERALL       | 89    | 0.1011  | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Rouge-L         | in_domain     | 42    | 0.0394  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Rouge-L         | out_of_domain | 47    | 0.0676  | default |
+-------------+------------+-----------------+---------------+-------+---------+---------+
| model       | tool_bench | Rouge-L         | OVERALL       | 89    | 0.0543  | -       |
+-------------+------------+-----------------+---------------+-------+---------+---------+
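The OVERALL rows in the table appear to be sample-weighted means of the per-subset scores, weighted by the Num column. This is an inference from the numbers, not something the evaluation tool's output states. A minimal Python sketch that reproduces a few of them (the helper name `weighted_overall` is made up for illustration):

```python
def weighted_overall(subsets):
    """Return the mean of subset scores weighted by sample count.

    `subsets` is a list of (num_samples, score) pairs copied from the table.
    Assumption: OVERALL = sum(num * score) / sum(num).
    """
    total = sum(num for num, _ in subsets)
    if total == 0:
        # Matches the Plan.EM rows, where no samples were scored.
        return 0.0
    return sum(num * score for num, score in subsets) / total


# (Num, Score) pairs taken from the table above.
gpqa_pass1 = [(50, 0.34), (50, 0.32), (50, 0.32)]
tool_bench_act_em = [(42, 0.2619), (47, 0.3617)]
tool_bench_f1 = [(42, 0.2095), (47, 0.2527)]

print(round(weighted_overall(gpqa_pass1), 4))         # 0.3267
print(round(weighted_overall(tool_bench_act_em), 4))  # 0.3146
print(round(weighted_overall(tool_bench_f1), 4))      # 0.2323
```

Because each subset has only 42 to 50 samples, a single example shifts a subset score by about 0.02, so small differences between rows should be read with caution.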
Use this model
with llama-cli
llama-cli -m Qwen3-4B-toolcall.Q4_K_M.gguf
with ollama
- create a Modelfile named Qwen3-4B-toolcall.Q4_K_M.txt like:
FROM ./Qwen3-4B-toolcall.Q4_K_M.gguf
TEMPLATE """<|im_start|>system
You are a helpful assistant<|im_end|>
<|im_start|>user
{{ .Prompt }}<|im_end|>
<|im_start|>assistant
"""
- then create a model using ollama
ollama create Qwen3-4B-toolcall.Q4_K_M -f Qwen3-4B-toolcall.Q4_K_M.txt
- then run it
ollama run Qwen3-4B-toolcall.Q4_K_M
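Once the model has been created, it can also be queried programmatically. The sketch below assumes an Ollama server is running locally on its default port (11434) and that the model was created under the name used above. It sends a non-streaming request to Ollama's `/api/generate` endpoint using only the Python standard library. The helper names `build_request` and `generate` are made up for illustration.

```python
import json
import urllib.request

# Assumptions: default local Ollama endpoint, model name from `ollama create` above.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL_NAME = "Qwen3-4B-toolcall.Q4_K_M"


def build_request(prompt, model=MODEL_NAME, url=OLLAMA_URL):
    """Build a POST request asking for a single, non-streamed completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt):
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return json.loads(resp.read())["response"]


# Example (requires a running Ollama server):
# print(generate("What is 12 * 7?"))
```

Setting `"stream": False` returns one JSON object instead of a stream of partial chunks, which keeps the example short.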