add use case for vllm & modify Citation (#4)
Commit: 2907c29945ffaa6297b9e7aca27a8eb9239a3a0a
Co-authored-by: yanzhao <[email protected]>
README.md
CHANGED
@@ -63,6 +63,7 @@ KeyError: 'qwen3'
 
 ```python
 # Requires transformers>=4.51.0
+# Requires sentence-transformers>=2.7.0
 
 from sentence_transformers import SentenceTransformer
 
@@ -165,6 +166,41 @@ scores = (embeddings[:2] @ embeddings[2:].T)
 print(scores.tolist())
 # [[0.7534257769584656, 0.1146894246339798], [0.03198453038930893, 0.6258305311203003]]
 ```
+
+### vLLM Usage
+
+```python
+# Requires vllm>=0.8.5
+import torch
+import vllm
+from vllm import LLM
+
+def get_detailed_instruct(task_description: str, query: str) -> str:
+    return f'Instruct: {task_description}\nQuery:{query}'
+
+# Each query must come with a one-sentence instruction that describes the task
+task = 'Given a web search query, retrieve relevant passages that answer the query'
+
+queries = [
+    get_detailed_instruct(task, 'What is the capital of China?'),
+    get_detailed_instruct(task, 'Explain gravity')
+]
+# No need to add instruction for retrieval documents
+documents = [
+    "The capital of China is Beijing.",
+    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
+]
+input_texts = queries + documents
+
+model = LLM(model="Qwen/Qwen3-Embedding-4B", task="embed")
+
+outputs = model.embed(input_texts)
+embeddings = torch.tensor([o.outputs.embedding for o in outputs])
+scores = (embeddings[:2] @ embeddings[2:].T)
+print(scores.tolist())
+# [[0.7525103688240051, 0.1143278032541275], [0.030893627554178238, 0.6239761114120483]]
+```
+
 📌 **Tip**: We recommend that developers customize the `instruct` according to their specific scenarios, tasks, and languages. Our tests have shown that in most retrieval scenarios, not using an `instruct` on the query side can lead to a drop in retrieval performance by approximately 1% to 5%.
 
 ## Evaluation
@@ -222,11 +258,10 @@ print(scores.tolist())
 If you find our work helpful, feel free to give us a cite.
 
 ```
-@
-
-
-
-
-year = {2025}
+@article{qwen3embedding,
+  title={Qwen3 Embedding: Advancing Text Embedding and Reranking Through Foundation Models},
+  author={Zhang, Yanzhao and Li, Mingxin and Long, Dingkun and Zhang, Xin and Lin, Huan and Yang, Baosong and Xie, Pengjun and Yang, An and Liu, Dayiheng and Lin, Junyang and Huang, Fei and Zhou, Jingren},
+  journal={arXiv preprint arXiv:2506.05176},
+  year={2025}
 }
 ```
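The tip in the diff recommends customizing the `instruct` per scenario while leaving documents uninstructed. A minimal sketch of what that customization looks like on the query side, reusing the `get_detailed_instruct` helper from the vLLM example; the code-search task string here is an illustrative assumption, not one shipped with the model:

```python
# Query-side instruct formatting, as used in the README's vLLM example.
def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery:{query}'

# Task string taken from the diff above.
web_task = 'Given a web search query, retrieve relevant passages that answer the query'
# Hypothetical custom task for a code-search scenario (assumed, for illustration).
code_task = 'Given a code-related question, retrieve code snippets that answer it'

# Only queries get an instruct; documents are embedded as plain text.
print(get_detailed_instruct(web_task, 'What is the capital of China?'))
print(get_detailed_instruct(code_task, 'How do I reverse a list in Python?'))
```

The formatted string is what gets embedded, so swapping the task description is the entire customization; per the tip above, skipping it on the query side costs roughly 1% to 5% retrieval performance in most scenarios.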