Update README.md
README.md CHANGED
@@ -28,11 +28,6 @@ _**Weizhou Shen, Chenliang Li, Fanqi Wan, Shengyi Liao, Shaopeng Lai, Bo Zhang,
 
 _Tongyi Lab, Alibaba Group_
 
-<p align="center">
-    <img src="./assets/performance.png" width="100%"> <br>
-</p>
-
-
 
 
 </div>
@@ -42,9 +37,6 @@ _Tongyi Lab, Alibaba Group_
 In this work, we present QwenLong-CPRS, a novel framework designed to optimize long-context processing through query-aware multi-granularity compression, outperforming RAG and sparse-attention methods. In contrast to RAG's coarse chunk-level retrieval, it achieves precise information extraction via token-level content selection, improving accuracy. Unlike sparse attention (SA), which requires model retraining, it works as a plug-and-play module compatible with any downstream LLM, with no retraining required. This dual advantage enables both fine-grained context optimization and seamless integration across architectures.
 
 
-<p align="center">
-    <img src="./assets/concept.png" width="100%"> <br>
-</p>
 
 We implement QwenLong-CPRS with four key innovations:
 * _**Controllable Context Optimization**_: Processes control prompts + queries to generate compact, task-specific context segments without retraining.
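To make the plug-and-play flow described in the paragraph above concrete, here is a minimal sketch of the overall pipeline: a query-aware compression step conditioned on a control prompt produces a compact context that is then handed to an unmodified downstream LLM. Everything in the sketch is illustrative; `compress_context` is a hypothetical, naive lexical-overlap stand-in, not the QwenLong-CPRS model or its API.

```python
# Illustrative sketch of the plug-and-play design: a compression step sits
# between the raw long context and an unmodified downstream LLM.
# `compress_context` is a hypothetical stand-in for QwenLong-CPRS.

from typing import List


def compress_context(control_prompt: str, query: str, long_context: str,
                     max_sentences: int = 2) -> str:
    """Keep the sentences most relevant to the query.

    A real compressor also conditions on `control_prompt`; this naive
    stand-in ignores it and scores sentences by lexical overlap only.
    """
    query_tokens = set(query.lower().split())
    sentences: List[str] = [s.strip() for s in long_context.split(".") if s.strip()]
    scored = sorted(sentences,
                    key=lambda s: len(query_tokens & set(s.lower().split())),
                    reverse=True)
    kept = set(scored[:max_sentences])
    # Preserve the original ordering so the compressed context stays readable.
    return ". ".join(s for s in sentences if s in kept) + "."


def build_downstream_prompt(query: str, compressed_context: str) -> str:
    """Assemble the prompt handed to any downstream LLM; the LLM itself is
    left untouched, which is the point of the plug-and-play design."""
    return ("Answer the question using only the context below.\n\n"
            f"Context:\n{compressed_context}\n\n"
            f"Question: {query}\nAnswer:")


if __name__ == "__main__":
    control = "Keep only the sentences needed to answer the user's question."
    query = "Which group does Tongyi Lab belong to?"
    long_context = ("Tongyi Lab is part of Alibaba Group. "
                    "The lab works on large language models. "
                    "This sentence is unrelated filler. "
                    "So is this one.")
    compressed = compress_context(control, query, long_context)
    print(build_downstream_prompt(query, compressed))
```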
@@ -58,11 +50,6 @@ We implement QwenLong-CPRS with four key innovations:
 
 
 
-
-<p align="center">
-    <img src="./assets/framework.png" width="100%"> <br>
-</p>
-
 ## 🎉 News
 
 - **May 26, 2025:** 🔥 We release [🤗 QwenLong-CPRS-7B](https://huggingface.co/Tongyi-Zhiwen/QwenLong-CPRS-7B), a 7B context compression model designed for explicit long-context optimization.
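The checkpoint announced above is hosted on the Hugging Face Hub, so it can be fetched like any other model repo. The snippet below only downloads the repo files; it makes no assumption about how QwenLong-CPRS-7B is loaded or invoked, which is defined by the project's own inference script and model card.

```python
# Sketch: download the released checkpoint from the Hugging Face Hub.
# This only fetches the repository contents; see the project's inference
# code and model card for how the compressor itself is run.

from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="Tongyi-Zhiwen/QwenLong-CPRS-7B")
print(f"Checkpoint files downloaded to: {local_dir}")
```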
@@ -75,28 +62,6 @@ We implement QwenLong-CPRS with four key innovations:
 
 
 
-
-## 🎯 Model Results
-
-Here are the evaluation results.
-
-<p align="center">
-    <img src="./assets/main_res.png" width="100%"> <br>
-</p>
-
-<p align="center">
-    <img src="./assets/niah.png" width="100%"> <br>
-</p>
-
-<p align="center">
-    <img src="./assets/different_llm.png" width="100%"> <br>
-</p>
-
-
-<p align="center">
-    <img src="./assets/latency.png" width="100%"> <br>
-</p>
-
 ## 🛠️ Requirements
 
 ```bash
@@ -166,13 +131,6 @@ python infer.py \
 ```
 
 
-## 👥 Join the Community
-Chinese users can scan QR codes to join the DingTalk/WeChat groups.
-
-| WeChat | DingTalk |
-|----------|---------|
-|  |  |
-
 ## 📝 Citation
 
 If you find this work relevant to your research or applications, please feel free to cite it!