littlebird13 committed on
Commit 99cabfa · verified · 1 Parent(s): 1c68a84

Upload folder using huggingface_hub

Files changed (4)
  1. README.md +181 -3
  2. config.json +2 -2
  3. generation_config.json +2 -9
  4. model.safetensors +2 -2
README.md CHANGED
@@ -1,3 +1,181 @@
- ---
- license: apache-2.0
- ---
# Qwen3-Embedding-0.6B

<p align="center">
    <img src="https://qianwen-res.oss-accelerate-overseas.aliyuncs.com/logo_qwen3.png" width="400"/>
</p>

## Highlights

The Qwen3 Embedding series is the latest proprietary model family of the Qwen lineup, designed specifically for text embedding and ranking tasks. Building upon the dense foundation models of the Qwen3 series, it provides a comprehensive range of text embedding and reranking models in various sizes (0.6B, 4B, and 8B). The series inherits the exceptional multilingual capabilities, long-text understanding, and reasoning skills of its foundation models, and delivers significant advances across text embedding and ranking tasks, including text retrieval, code retrieval, text classification, text clustering, and bitext mining.

**Exceptional Versatility**: The embedding model achieves state-of-the-art performance across a wide range of downstream application evaluations. The 8B embedding model ranks **No. 1** on the MTEB multilingual leaderboard (as of May 26, 2025, score **70.58**), while the reranking model excels in various text retrieval scenarios.

**Comprehensive Flexibility**: The Qwen3 Embedding series offers a full spectrum of sizes (from 0.6B to 8B) for both embedding and reranking models, catering to use cases that prioritize either efficiency or effectiveness. Developers can seamlessly combine the two modules. Additionally, the embedding model supports flexible, user-defined output dimensions, and both embedding and reranking models support user-defined instructions to enhance performance for specific tasks, languages, or scenarios.

**Multilingual Capability**: The Qwen3 Embedding series supports over 100 languages, including various programming languages, and provides robust multilingual, cross-lingual, and code retrieval capabilities.

## Model Overview

**Qwen3-Embedding-0.6B** has the following features:

- Model Type: Text Embedding
- Supported Languages: 100+ languages
- Number of Parameters: 0.6B
- Context Length: 32K
- Embedding Dimension: Up to 1024; supports user-defined output dimensions ranging from 32 to 1024

For more details, including benchmark evaluations, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3-Embedding/) and [GitHub](https://github.com/QwenLM/Qwen3-Embedding).

## Qwen3 Embedding Series Model List

| Model Type | Models | Size | Layers | Sequence Length | Embedding Dimension | MRL Support | Instruct Aware |
|------------------|----------------------|------|--------|-----------------|---------------------|-------------|----------------|
| Text Embedding | [Qwen3-Embedding-0.6B](https://modelscope.cn/models/tongyi/Qwen3-Embedding-0.6B) | 0.6B | 28 | 32K | 1024 | Yes | Yes |
| Text Embedding | [Qwen3-Embedding-4B](https://modelscope.cn/models/tongyi/Qwen3-Embedding-4B) | 4B | 36 | 32K | 2560 | Yes | Yes |
| Text Embedding | [Qwen3-Embedding-8B](https://modelscope.cn/models/tongyi/Qwen3-Embedding-8B) | 8B | 36 | 32K | 4096 | Yes | Yes |
| Text Reranking | [Qwen3-Reranker-0.6B](https://modelscope.cn/models/tongyi/Qwen3-Reranker-0.6B) | 0.6B | 28 | 32K | - | - | Yes |
| Text Reranking | [Qwen3-Reranker-4B](https://modelscope.cn/models/tongyi/Qwen3-Reranker-4B) | 4B | 36 | 32K | - | - | Yes |
| Text Reranking | [Qwen3-Reranker-8B](https://modelscope.cn/models/tongyi/Qwen3-Reranker-8B) | 8B | 36 | 32K | - | - | Yes |

> **Note**: `MRL Support` indicates whether the embedding model supports custom dimensions for the final embedding (see the sketch below). `Instruct Aware` indicates whether the embedding or reranking model supports customizing the input instruction for different tasks.

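As a concrete illustration of MRL support, the sketch below truncates a full-size embedding to a smaller user-chosen dimension and re-normalizes it, which is the usual way matryoshka-style embeddings are reduced. The tensor here is an illustrative stand-in; in practice `embeddings` would come from the Transformers example in the Usage section.

```python
import torch
import torch.nn.functional as F

# Illustrative stand-in for a batch of full-size (1024-d) model embeddings.
embeddings = torch.randn(4, 1024)

custom_dim = 256                            # any value in the supported 32..1024 range
reduced = embeddings[:, :custom_dim]        # keep only the leading dimensions
reduced = F.normalize(reduced, p=2, dim=1)  # re-normalize before computing similarity
```
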
## Usage

With Transformers versions earlier than 4.51.0, you may encounter the following error:
```
KeyError: 'qwen3'
```
Upgrading Transformers (for example, `pip install -U "transformers>=4.51.0"`) resolves it.

### Transformers Usage

```python
# Requires transformers>=4.51.0

import torch
import torch.nn.functional as F

from torch import Tensor
from transformers import AutoTokenizer, AutoModel


def last_token_pool(last_hidden_states: Tensor,
                    attention_mask: Tensor) -> Tensor:
    # Pool each sequence by taking the hidden state of its last non-padding token.
    left_padding = (attention_mask[:, -1].sum() == attention_mask.shape[0])
    if left_padding:
        return last_hidden_states[:, -1]
    else:
        sequence_lengths = attention_mask.sum(dim=1) - 1
        batch_size = last_hidden_states.shape[0]
        return last_hidden_states[torch.arange(batch_size, device=last_hidden_states.device), sequence_lengths]


def get_detailed_instruct(task_description: str, query: str) -> str:
    return f'Instruct: {task_description}\nQuery:{query}'


def tokenize(tokenizer, input_texts, eod_id, max_length):
    # Truncate (leaving room for the token appended below), add the
    # end-of-document token to every sequence, then pad the batch.
    batch_dict = tokenizer(input_texts, padding=False, truncation=True, max_length=max_length-2)
    for seq, att in zip(batch_dict["input_ids"], batch_dict["attention_mask"]):
        seq.append(eod_id)
        att.append(1)
    batch_dict = tokenizer.pad(batch_dict, padding=True, return_tensors="pt")
    return batch_dict


# Each query must come with a one-sentence instruction that describes the task
task = 'Given a web search query, retrieve relevant passages that answer the query'

queries = [
    get_detailed_instruct(task, 'What is the capital of China?'),
    get_detailed_instruct(task, 'Explain gravity')
]
# No need to add instruction for retrieval documents
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun."
]
input_texts = queries + documents

tokenizer = AutoTokenizer.from_pretrained('tongyi/Qwen3-Embedding-0.6B', padding_side='left')
model = AutoModel.from_pretrained('tongyi/Qwen3-Embedding-0.6B')

# We recommend enabling flash_attention_2 for better acceleration and memory saving.
# model = AutoModel.from_pretrained('tongyi/Qwen3-Embedding-0.6B', attn_implementation="flash_attention_2", torch_dtype=torch.float16).cuda()

eod_id = tokenizer.convert_tokens_to_ids("<|endoftext|>")
max_length = 8192

# Tokenize the input texts
batch_dict = tokenize(tokenizer, input_texts, eod_id, max_length)
batch_dict.to(model.device)
outputs = model(**batch_dict)
embeddings = last_token_pool(outputs.last_hidden_state, batch_dict['attention_mask'])

# normalize embeddings
embeddings = F.normalize(embeddings, p=2, dim=1)
# Query-document cosine similarities (queries are the first two rows).
scores = (embeddings[:2] @ embeddings[2:].T)
print(scores.tolist())
```

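For convenience, embedding models like this one are often loaded through the sentence-transformers library as well. The sketch below is an assumption-laden convenience recipe rather than something this card documents: it assumes sentence-transformers v3+ is installed, and `prompt_name="query"` assumes the repository ships a prompt named `query`.

```python
# Minimal sketch using sentence-transformers (assumed installed; v3+ for similarity()).
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("tongyi/Qwen3-Embedding-0.6B")

queries = ["What is the capital of China?", "Explain gravity"]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other.",
]

# prompt_name="query" assumes a "query" prompt is shipped with the repo; drop it otherwise.
query_embeddings = model.encode(queries, prompt_name="query")
document_embeddings = model.encode(documents)

print(model.similarity(query_embeddings, document_embeddings))
```
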
📌 **Tip**: We recommend that developers customize the `instruct` according to their specific scenarios, tasks, and languages. Our tests have shown that in most retrieval scenarios, not using an `instruct` on the query side reduces retrieval performance by approximately 1% to 5%. A hypothetical example of a task-specific instruction follows.

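The sketch below adapts the `get_detailed_instruct` helper from the usage example above to a different task; the task wording is illustrative, not an officially tuned instruction:

```python
# Hypothetical task-specific instruction (wording is illustrative).
code_task = 'Given a natural-language description, retrieve code snippets that implement it'

query = get_detailed_instruct(code_task, 'read a CSV file into a pandas DataFrame')
# Documents (here, code snippets) are embedded without any instruction, as above.
```
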
## Evaluation

### MTEB (Multilingual)

| Model | Size | Mean (Task) | Mean (Type) | Bitext Mining | Class. | Clust. | Inst. Retri. | Multi. Class. | Pair. Class. | Rerank | Retri. | STS |
|----------------------------------|:-------:|:-------------:|:-------------:|:--------------:|:--------:|:--------:|:--------------:|:---------------:|:--------------:|:--------:|:--------:|:------:|
| NV-Embed-v2 | 7B | 56.29 | 49.58 | 57.84 | 57.29 | 40.80 | 1.04 | 18.63 | 78.94 | 63.82 | 56.72 | 71.10 |
| GritLM-7B | 7B | 60.92 | 53.74 | 70.53 | 61.83 | 49.75 | 3.45 | 22.77 | 79.94 | 63.78 | 58.31 | 73.33 |
| BGE-M3 | 0.6B | 59.56 | 52.18 | 79.11 | 60.35 | 40.88 | -3.11 | 20.10 | 80.76 | 62.79 | 54.60 | 74.12 |
| multilingual-e5-large-instruct | 0.6B | 63.22 | 55.08 | 80.13 | 64.94 | 50.75 | -0.40 | 22.91 | 80.86 | 62.61 | 57.12 | 76.81 |
| gte-Qwen2-1.5B-instruct | 1.5B | 59.45 | 52.69 | 62.51 | 58.32 | 52.05 | 0.74 | 24.02 | 81.58 | 62.58 | 60.78 | 71.61 |
| gte-Qwen2-7B-instruct | 7B | 62.51 | 55.93 | 73.92 | 61.55 | 52.77 | 4.94 | 25.48 | 85.13 | 65.55 | 60.08 | 73.98 |
| text-embedding-3-large | - | 58.93 | 51.41 | 62.17 | 60.27 | 46.89 | -2.68 | 22.03 | 79.17 | 63.89 | 59.27 | 71.68 |
| Cohere-embed-multilingual-v3.0 | - | 61.12 | 53.23 | 70.50 | 62.95 | 46.89 | -1.89 | 22.74 | 79.88 | 64.07 | 59.16 | 74.80 |
| Gemini Embedding | - | 68.37 | 59.59 | 79.28 | 71.82 | 54.59 | 5.18 | **29.16** | 83.63 | 65.58 | 67.71 | 79.40 |
| **Qwen3-Embedding-0.6B** | 0.6B | 64.33 | 56.00 | 72.22 | 66.83 | 52.33 | 5.09 | 24.59 | 80.83 | 61.41 | 64.64 | 76.17 |
| **Qwen3-Embedding-4B** | 4B | 69.45 | 60.86 | 79.36 | 72.33 | 57.15 | **11.56** | 26.77 | 85.05 | 65.08 | 69.60 | 80.86 |
| **Qwen3-Embedding-8B** | 8B | **70.58** | **61.69** | **80.89** | **74.00** | **57.65** | 10.06 | 28.66 | **86.40** | **65.63** | **70.88** | **81.08** |

> **Note**: For compared models, the scores were retrieved from the MTEB online [leaderboard](https://huggingface.co/spaces/mteb/leaderboard) on May 24, 2025.

### MTEB (Eng v2)

| MTEB English / Models | Param. | Mean (Task) | Mean (Type) | Class. | Clust. | Pair Class. | Rerank. | Retri. | STS | Summ. |
|--------------------------------|:--------:|:------------:|:------------:|:--------:|:--------:|:-------------:|:---------:|:--------:|:-------:|:-------:|
| multilingual-e5-large-instruct | 0.6B | 65.53 | 61.21 | 75.54 | 49.89 | 86.24 | 48.74 | 53.47 | 84.72 | 29.89 |
| NV-Embed-v2 | 7.8B | 69.81 | 65.00 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| GritLM-7B | 7.2B | 67.07 | 63.22 | 81.25 | 50.82 | 87.29 | 49.59 | 54.95 | 83.03 | 35.65 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.20 | 63.26 | 85.84 | 53.54 | 87.52 | 49.25 | 50.25 | 82.51 | 33.94 |
| stella_en_1.5B_v5 | 1.5B | 69.43 | 65.32 | 89.38 | 57.06 | 88.02 | 50.19 | 52.42 | 83.27 | 36.91 |
| gte-Qwen2-7B-instruct | 7.6B | 70.72 | 65.77 | 88.52 | 58.97 | 85.90 | 50.47 | 58.09 | 82.69 | 35.74 |
| gemini-embedding-exp-03-07 | - | 73.30 | 67.67 | 90.05 | 59.39 | 87.70 | 48.59 | 64.35 | 85.29 | 38.28 |
| **Qwen3-Embedding-0.6B** | 0.6B | 70.70 | 64.88 | 85.76 | 54.05 | 84.37 | 48.18 | 61.83 | 86.57 | 33.43 |
| **Qwen3-Embedding-4B** | 4B | 74.60 | 68.10 | 89.84 | 57.51 | 87.01 | 50.76 | 68.46 | 88.72 | 34.39 |
| **Qwen3-Embedding-8B** | 8B | 75.22 | 68.71 | 90.43 | 58.57 | 87.52 | 51.56 | 69.44 | 88.58 | 34.83 |

### C-MTEB (MTEB Chinese)

| C-MTEB | Param. | Mean (Task) | Mean (Type) | Class. | Clust. | Pair Class. | Rerank. | Retr. | STS |
|------------------|--------|------------|------------|--------|--------|-------------|---------|-------|-------|
| multilingual-e5-large-instruct | 0.6B | 58.08 | 58.24 | 69.80 | 48.23 | 64.52 | 57.45 | 63.65 | 45.81 |
| bge-multilingual-gemma2 | 9B | 67.64 | - | 75.31 | 59.30 | 86.67 | 68.28 | 73.73 | 55.19 |
| gte-Qwen2-1.5B-instruct | 1.5B | 67.12 | 67.79 | 72.53 | 54.61 | 79.50 | 68.21 | 71.86 | 60.05 |
| gte-Qwen2-7B-instruct | 7.6B | 71.62 | 72.19 | 75.77 | 66.06 | 81.16 | 69.24 | 75.70 | 65.20 |
| ritrieve_zh_v1 | 0.3B | 72.71 | 73.85 | 76.88 | 66.50 | 85.98 | 72.86 | 76.97 | 63.92 |
| **Qwen3-Embedding-0.6B** | 0.6B | 66.33 | 67.45 | 71.40 | 68.74 | 76.42 | 62.58 | 71.03 | 54.52 |
| **Qwen3-Embedding-4B** | 4B | 72.27 | 73.51 | 75.46 | 77.89 | 83.34 | 66.05 | 77.03 | 61.26 |
| **Qwen3-Embedding-8B** | 8B | 73.84 | 75.00 | 76.97 | 80.08 | 84.23 | 66.99 | 78.21 | 63.53 |

## Citation

If you find our work helpful, please feel free to cite it:

```
@misc{qwen3-embedding,
    title  = {Qwen3-Embedding},
    url    = {https://qwenlm.github.io/blog/qwen3/},
    author = {Qwen Team},
    month  = {May},
    year   = {2025}
}
```

config.json CHANGED
@@ -5,13 +5,13 @@
  "attention_bias": false,
  "attention_dropout": 0.0,
  "bos_token_id": 151643,
- "eos_token_id": 151645,
+ "eos_token_id": 151643,
  "head_dim": 128,
  "hidden_act": "silu",
  "hidden_size": 1024,
  "initializer_range": 0.02,
  "intermediate_size": 3072,
- "max_position_embeddings": 40960,
+ "max_position_embeddings": 32768,
  "max_window_layers": 28,
  "model_type": "qwen3",
  "num_attention_heads": 16,
generation_config.json CHANGED
@@ -1,13 +1,6 @@
  {
  "bos_token_id": 151643,
- "do_sample": true,
- "eos_token_id": [
-   151645,
-   151643
- ],
- "pad_token_id": 151643,
- "temperature": 0.6,
- "top_k": 20,
- "top_p": 0.95,
+ "eos_token_id": 151643,
+ "max_new_tokens": 2048,
  "transformers_version": "4.51.3"
  }
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:27cd75a405b9c1b46b59abfd88aaa209e6fed2a1972cde9b70e7659537c5e65b
- size 1191588280
+ oid sha256:0437e45c94563b09e13cb7a64478fc406947a93cb34a7e05870fc8dcd48e23fd
+ size 1191586416