File size: 8,624 Bytes
e1aa7ec ae6162d d5013c9 ae6162d 5ddc595 6e010f0 7355180 ae6162d 6a2d1dd ae6162d 8327fc3 e1aa7ec 2b0f876 9097af3 2b0f876 bd19003 e1aa7ec ea333b8 e1aa7ec 0c3e61d e1aa7ec 07b3e81 a8382b4 d01fce2 e1aa7ec d01fce2 e1aa7ec a8382b4 e1aa7ec fe02de1 e1aa7ec fe02de1 4e68008 fe02de1 a9c56df fe02de1 4e68008 fe02de1 4e68008 fe02de1 e1aa7ec 0a5d291 e1aa7ec |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 |
---
license: apache-2.0
datasets:
- agentica-org/DeepScaleR-Preview-Dataset
base_model:
- Vinnnf/Thinkless-1.5B-Warmup
pipeline_tag: text-generation
library_name: transformers
---
# Thinkless: LLM Learns When to Think

<table>
<thead>
</thead>
<tbody>
<tr>
<td>📄 <strong>Paper Link</strong></td>
<td><a href="http://arxiv.org/abs/2505.13379">ArXiv</a></td>
</tr>
<tr>
<td>💻 <strong>RL Code</strong></td>
<td><a href="https://github.com/VainF/Thinkless">VainF/Thinkless</a></td>
</tr>
<tr>
<td>💻 <strong>SFT Code</strong></td>
<td><a href="https://github.com/VainF/Reasoning-SFT">VainF/Reasoning-SFT</a></td>
</tr>
<tr>
<td>🤖 <strong>RL Model</strong></td>
<td><a href="https://huggingface.co/Vinnnf/Thinkless-1.5B-RL-DeepScaleR">Thinkless-1.5B-RL-DeepScaleR</a></td>
</tr>
<tr>
<td>🐣 <strong>Warmup Model</strong></td>
<td><a href="https://huggingface.co/Vinnnf/Thinkless-1.5B-Warmup">Thinkless-1.5B-Warmup</a></td>
</tr>
<tr>
<td>📊 <strong>Data for Warmup</strong></td>
<td><a href="https://huggingface.co/datasets/Vinnnf/Hybrid-OpenThoughts2-1M-1.5B">Hybrid-OpenThoughts2-1M-1.5B</a></td>
</tr>
<tr>
<td>📊 <strong>Data for RL</strong></td>
<td><a href="https://huggingface.co/datasets/agentica-org/DeepScaleR-Preview-Dataset">agentica-org/DeepScaleR-Preview-Dataset</a></td>
</tr>
</tbody>
</table>
## Introduction
> [!NOTE]
> ***Can LLMs learn when to think?***
We propose Thinkless, a learnable framework that empowers an LLM to adaptively select between short-form and long-form reasoning, based on both task complexity and the model's ability. Thinkless is trained under a reinforcement learning paradigm and employs two control tokens, \<short\> for concise responses and \<think\> for detailed reasoning. At the core of our method is a Decoupled Group Relative Policy Optimization (DeGRPO) algorithm, which decomposes the learning objective of hybrid reasoning into two components: (1) a control token loss that governs the selection of the reasoning mode, and (2) a response loss that improves the accuracy of the generated answers. This decoupled formulation enables fine-grained control over the contributions of each objective, stabilizing training and effectively preventing collapse observed in vanilla GRPO. Empirically, on several benchmarks such as Minerva Algebra, MATH-500, and GSM8K, Thinkless is able to reduce the usage of long-chain thinking by 50\% - 90\%, significantly reducing the computational cost of Reasoning Language Models.
## Pipeline

## QuickStart
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "Vinnnf/Thinkless-1.5B-RL-DeepScaleR"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
instruction = "Please reason step by step, and put your final answer within \\boxed{}."
prompt = "The arithmetic mean of 7, 2, $x$ and 10 is 9. What is the value of $x$?"
# prompt = "What is the smallest positive perfect cube that can be written as the sum of three consecutive integers?"
# prompt = "How many r's are in the word \"strawberry\""
messages = [
{"role": "user", "content": f"{instruction}\n{prompt}"},
]
text = tokenizer.apply_chat_template(
messages,
tokenize=False,
add_generation_prompt=True
)
# text = text + "<think>" # Uncomment this to force thinking mode
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(
**model_inputs,
max_new_tokens=16384,
do_sample=True,
temperature=0.6,
top_p=0.95
)
generated_ids = [
output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
num_tokens = len(generated_ids[0])
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
think_mode = ("<think>" in response)
print(text+response)
print(f"\nThink Mode: {think_mode}")
print(f"Number of tokens: {num_tokens}")
```
## Case Study
**User:**
```
1+1=?
```
**Thinkless-1.5B:**
```
<short>
To solve the problem \(1 + 1\), we will follow a step-by-step approach:
1. **Identify the numbers involved**: We have the numbers 1 and 1.
2. **Understand the operation**: The operation we need to perform is addition.
3. **Perform the addition**: When we add 1 and 1 together, we get 2.
So, the final answer is \(\boxed{2}\).
```
**DeepSeek-R1:**
```
Okay, so I need to figure out what 1 plus 1 equals. Let me start by thinking about basic addition. When you add two numbers together, you're combining their values. So if I have one apple and someone gives me another apple, how many apples do I have in total? That should be two apples.
Wait, let me make sure I'm not missing something here. In the simplest terms, addition is taking two quantities and putting them together. So 1 plus 1 means combining a single unit with another single unit. If I use my fingers, holding up one finger on my left hand and one on my right, then when I put them together, I have two fingers up. That seems right.
But maybe there's a trick to this question. Sometimes problems that look easy might have a different answer depending on the context. For example, in binary, 1 + 1 equals 10. But the question doesn't specify a number system, so I should assume it's in the decimal system, which is the default. So in decimal, 1 + 1 is definitely 2.
Another way to check is by using the number line. Starting at 1, if I move one unit to the right, I land on 2. That confirms it. Or using basic arithmetic rules: the sum of 1 and 1 is the next number in the sequence after 1, which is 2.
I guess there's no ambiguity here. Unless the question is trying to be philosophical or something, but mathematically, it's straightforward. So the answer should be 2.
The sum of 1 and 1 is calculated as follows:
1 + 1 = 2
In decimal arithmetic, adding the numbers 1 and 1 results in the number 2.
```
---
**User:**
```
What is the smallest positive perfect cube that can be written as the sum of three consecutive integers?
```
**Thinkless-1.5B:**
Note: The long reasoning process is omitted.
```
<think>
Okay, let's try to figure out the smallest positive perfect cube that can be written as the sum of three consecutive integers. Hmm, let's start by understanding the problem step by step.
First, a perfect cube is a number raised to the power of 3. So, we need to find the smallest positive integer \( n \) such that \( n = a + (a+1) + (a+2) \), where \( a \) is an integer. Then, \( n \) should also be a perfect cube. Let me write that down:
\( n = a + (a+1) + (a+2) \)
...
Next, 27:
Is 27 a perfect cube? Yes, \( 3^3 = 27 \). Let's check:
...
**Final Answer**
The smallest positive perfect cube that can be written as the sum of three consecutive integers is \boxed{27}.
</think>
To find the smallest positive perfect cube that can be written as the sum of three consecutive integers, we start by expressing the sum of three consecutive integers. Let the three consecutive integers be \( a \), \( a+1 \), and \( a+2 \). Their sum is:
\[
a + (a+1) + (a+2) = 3a + 3
\]
This sum must be a perfect cube. Let \( n \) be the perfect cube, so we have:
\[
n = 3(a + 1)
\]
Thus, \( n \) must be a multiple of 3. The smallest perfect cubes are 1, 8, 27, 64, 125, etc. We check each of these to see if they can be written as \( 3(a + 1) \):
- For \( n = 1 \): \( 3(a + 1) = 1 \) gives \( a = -\frac{2}{3} \), not an integer.
- For \( n = 8 \): \( 3(a + 1) = 8 \) gives \( a = \frac{5}{3} \), not an integer.
- For \( n = 27 \): \( 3(a + 1) = 27 \) gives \( a = 8 \), which is an integer. The three consecutive integers are 8, 9, and 10, and their sum is \( 8 + 9 + 10 = 27 \).
Checking the next perfect cubes (64, 125, etc.) confirms they do not yield integer values for \( a \). Therefore, the smallest positive perfect cube that can be written as the sum of three consecutive integers is:
\[
\boxed{27}
\]
```
## Citation
If you find this work helpful, please cite:
```
@article{fang2025thinkless,
title={Thinkless: LLM Learns When to Think},
author={Fang, Gongfan and Ma, Xinyin and Wang, Xinchao},
journal={arXiv preprint arXiv:2505.13379},
year={2025}
}
```
|