Add files using upload-large-folder tool
README.md CHANGED
@@ -1,11 +1,18 @@
 ---
 tags:
 - unsloth
+library_name: transformers
+license: apache-2.0
+license_link: https://huggingface.co/Qwen/Qwen3-8B/blob/main/LICENSE
+pipeline_tag: text-generation
 base_model:
 - Qwen/Qwen3-8B
-license: apache-2.0
 ---
+
 # Qwen3-8B
+<a href="https://chat.qwen.ai/" target="_blank" style="margin: 2px;">
+<img alt="Chat" src="https://img.shields.io/badge/%F0%9F%92%9C%EF%B8%8F%20Qwen%20Chat%20-536af5" style="display: inline-block; vertical-align: middle;"/>
+</a>
 
 ## Qwen3 Highlights
 
@@ -87,21 +94,23 @@ print("thinking content:", thinking_content)
 print("content:", content)
 ```
 
-For deployment, you can use `
--
+For deployment, you can use `sglang>=0.4.6.post1` or `vllm>=0.8.4` or to create an OpenAI-compatible API endpoint:
+- SGLang:
 ```shell
-
+python -m sglang.launch_server --model-path Qwen/Qwen3-8B --reasoning-parser qwen3
 ```
--
+- vLLM:
 ```shell
-
+vllm serve Qwen/Qwen3-8B --enable-reasoning --reasoning-parser deepseek_r1
 ```
 
+For local use, applications such as llama.cpp, Ollama, LMStudio, and MLX-LM have also supported Qwen3.
+
 ## Switching Between Thinking and Non-Thinking Mode
 
 > [!TIP]
-> The `enable_thinking` switch is also available in APIs created by
-> Please refer to our documentation for [
+> The `enable_thinking` switch is also available in APIs created by SGLang and vLLM.
+> Please refer to our documentation for [SGLang](https://qwen.readthedocs.io/en/latest/deployment/sglang.html#thinking-non-thinking-modes) and [vLLM](https://qwen.readthedocs.io/en/latest/deployment/vllm.html#thinking-non-thinking-modes) users.
 
 ### `enable_thinking=True`
 
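The commands added in the hunk above start an OpenAI-compatible server. As a minimal client-side sketch (not part of the diff), the snippet below assumes the vLLM command from the hunk running on its default port 8000 (SGLang defaults to 30000) and the standard `openai` Python client; the `reasoning_content` field is a non-standard, server-side extension and may not be present on every deployment.

```python
# Minimal client-side sketch: assumes one of the servers above is running locally
# on port 8000 (vLLM default; SGLang defaults to 30000) and `pip install openai`.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # adjust host/port to your deployment
    api_key="EMPTY",                      # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="Qwen/Qwen3-8B",
    messages=[{"role": "user", "content": "Give me a short introduction to large language models."}],
    temperature=0.6,
)

message = response.choices[0].message
# With a reasoning parser enabled, some servers expose the thinking block in a
# separate, non-standard field (e.g. `reasoning_content`), so guard the access.
print("thinking content:", getattr(message, "reasoning_content", None))
print("content:", message.content)
```
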
@@ -199,7 +208,7 @@ if __name__ == "__main__":
 print(f"Bot: {response_3}")
 ```
 
->
+> [!NOTE]
 > For API compatibility, when `enable_thinking=True`, regardless of whether the user uses `/think` or `/no_think`, the model will always output a block wrapped in `<think>...</think>`. However, the content inside this block may be empty if thinking is disabled.
 > When `enable_thinking=False`, the soft switches are not valid. Regardless of any `/think` or `/no_think` tags input by the user, the model will not generate think content and will not include a `<think>...</think>` block.
 
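The `> [!NOTE]` block added in the last hunk describes how the `enable_thinking` hard switch interacts with the `/think` and `/no_think` soft switches. As a minimal sketch (not part of the diff) of where that flag is applied, assuming a recent `transformers` release that forwards extra keyword arguments such as `enable_thinking` to the chat template:

```python
# Minimal sketch of the `enable_thinking` hard switch (assumes a transformers
# version recent enough to forward extra kwargs to the chat template).
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")

messages = [{"role": "user", "content": "How many r's are in strawberries? /no_think"}]

# Hard switch off: no <think>...</think> block is generated, and soft switches
# such as /think or /no_think in the prompt are ignored, as the note above states.
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=False,
)
print(text)
```

Printing `text` with `enable_thinking` set to `True` and then `False` shows how the rendered prompt changes, which is an easy way to check the behavior described in the note.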