littlebird13 and zyznull committed
Commit a3d38e3 · verified · 1 Parent(s): 4e42393

add use case for vllm & modify Citation (#5)


- add use case for vllm & modify Citation (7c0a850d147a83b59e7422f38fb6a672df0c52d5)


Co-authored-by: yanzhao <[email protected]>

Files changed (1)
  1. README.md +36 -6
README.md CHANGED
@@ -61,6 +61,7 @@ KeyError: 'qwen3'
 
 ```python
 # Requires transformers>=4.51.0
+# Requires sentence-transformers>=2.7.0
 
 from sentence_transformers import SentenceTransformer
 
@@ -164,6 +165,36 @@ scores = (embeddings[:2] @ embeddings[2:].T)
 print(scores.tolist())
 # [[0.7493016123771667, 0.0750647559762001], [0.08795969933271408, 0.6318399906158447]]
 ```
+
+### vLLM Usage
+
+```python
+# Requires vllm>=0.8.5
+import torch
+import vllm
+from vllm import LLM
+def get_detailed_instruct(task_description: str, query: str) -> str:
+    return f'Instruct: {task_description}\nQuery:{query}'
+# Each query must come with a one-sentence instruction that describes the task
+task = 'Given a web search query, retrieve relevant passages that answer the query'
+queries = [
+    get_detailed_instruct(task, 'What is the capital of China?'),
+    get_detailed_instruct(task, 'Explain gravity')
+]
+# No need to add instruction for retrieval documents
+documents = [
+    "The capital of China is Beijing.",
+    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
+]
+input_texts = queries + documents
+model = LLM(model="Qwen/Qwen3-Embedding-8B", task="embed")
+outputs = model.embed(input_texts)
+embeddings = torch.tensor([o.outputs.embedding for o in outputs])
+scores = (embeddings[:2] @ embeddings[2:].T)
+print(scores.tolist())
+# [[0.7482624650001526, 0.07556197047233582], [0.08875375241041183, 0.6300010681152344]]
+```
+
 📌 **Tip**: We recommend that developers customize the `instruct` according to their specific scenarios, tasks, and languages. Our tests have shown that in most retrieval scenarios, not using an `instruct` on the query side can lead to a drop in retrieval performance by approximately 1% to 5%.
 
 ## Evaluation
@@ -221,11 +252,10 @@ print(scores.tolist())
 If you find our work helpful, feel free to give us a cite.
 
 ```
-@misc{qwen3-embedding,
-    title = {Qwen3-Embedding},
-    url = {https://qwenlm.github.io/blog/qwen3/},
-    author = {Qwen Team},
-    month = {May},
-    year = {2025}
+@article{qwen3embedding,
+  title={Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models},
+  author={Zhang, Yanzhao and Li, Mingxin and Long, Dingkun and Zhang, Xin and Lin, Huan and Yang, Baosong and Xie, Pengjun and Yang, An and Liu, Dayiheng and Lin, Junyang and Huang, Fei and Zhou, Jingren},
+  journal={arXiv preprint arXiv:2506.05176},
+  year={2025}
 }
 ```
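
The 📌 **Tip** retained in the diff above recommends customizing the `instruct` for the specific scenario, task, and language. As a minimal, hypothetical sketch (not part of the README), the same `get_detailed_instruct` helper from the vLLM snippet can simply be called with a scenario-specific task description; the wording of `code_task` below is illustrative only:

```python
# Hypothetical illustration of the "customize the instruct" tip above.
# get_detailed_instruct mirrors the helper in the vLLM snippet; the task wording is made up.
def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery:{query}'

# Scenario-specific instruction for a code-search use case (illustrative, not from the README)
code_task = 'Given a natural-language description, retrieve code snippets that implement it'
query = get_detailed_instruct(code_task, 'function that parses ISO 8601 timestamps in Python')
print(query)
# Documents would still be embedded without any instruction, as in the snippets above.
```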