Doubao-1.5-Embedding

Introduction

We introduce Doubao-1.5-Embedding, a powerful embedding model built on top of our pretrained LLM. It stands out for the following features:

  • Generalist: It achieves state-of-the-art performance on the MTEB benchmark in both Chinese and English.

  • Reasoning Expertise: It excels in complex query understanding and reasoning, delivering state-of-the-art performance on the BRIGHT benchmark.

  • Flexibility: It supports multiple embedding dimensions (2048, 1024, 512, 256) with minimal performance degradation at lower dimensions.
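
To make the multi-dimension feature concrete, here is a minimal sketch of how a caller might derive a smaller embedding from a full 2048-dimensional one. It assumes the lower dimensions are obtained by keeping a prefix of the full vector and re-normalizing it, as is common for Matryoshka-style embedding models; this card does not confirm that mechanism. The function names are hypothetical, and the input vectors are random stand-ins rather than real model outputs.

```python
import math
import random

# Dimensions listed in this model card.
SUPPORTED_DIMS = (2048, 1024, 512, 256)


def truncate_embedding(vec, dim):
    """Keep the first `dim` components and rescale to unit L2 norm.

    Assumption: prefix truncation is how lower dimensions are produced.
    The model card does not state this.
    """
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"unsupported dimension: {dim}")
    if len(vec) < dim:
        raise ValueError("vector is shorter than the requested dimension")
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine(a, b):
    """Dot product of two unit-norm vectors, which equals their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


if __name__ == "__main__":
    # Random stand-in vectors; real embeddings would come from the model.
    rng = random.Random(0)
    full_a = [rng.gauss(0.0, 1.0) for _ in range(2048)]
    full_b = [rng.gauss(0.0, 1.0) for _ in range(2048)]

    for dim in SUPPORTED_DIMS:
        a = truncate_embedding(full_a, dim)
        b = truncate_embedding(full_b, dim)
        print(dim, len(a), round(cosine(a, b), 4))
```

Smaller vectors reduce index size and speed up similarity search. All vectors in one index need the same dimension, so queries and documents should be truncated identically.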

The model's API will be publicly available soon. Please stay tuned! :)

Performance

MTEB_v2 (English)

| Model | AVG | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS | Summarization |
|---|---|---|---|---|---|---|---|---|
| Linq-Embed-Mistral | 69.80 | 83.00 | 54.07 | 88.44 | 49.44 | 60.14 | 84.69 | 37.26 |
| NV-Embed-v2 | 69.81 | 87.19 | 47.66 | 88.69 | 49.61 | 62.84 | 83.82 | 35.21 |
| jasper-en-vision-language-v1 | 71.41 | 90.27 | 60.52 | 88.14 | 50.00 | 56.05 | 84.37 | 37.19 |
| gemini-embedding-exp-03-07 | 73.30 | 90.05 | 59.39 | 87.70 | 48.59 | 64.35 | 85.29 | 38.28 |
| RELLE | 73.59 | 90.06 | 58.20 | 88.74 | 49.46 | 67.15 | 84.04 | 38.02 |
| Doubao-1.5-Embedding | 74.91 | 90.66 | 60.93 | 86.53 | 50.95 | 67.85 | 87.18 | 33.99 |

C-MTEB (Chinese)

| Model | AVG | Classification | Clustering | Pair Classification | Reranking | Retrieval | STS |
|---|---|---|---|---|---|---|---|
| bge-multilingual-gemma2 | 67.64 | 75.31 | 59.30 | 86.67 | 68.28 | 73.73 | 55.19 |
| gte-Qwen2-7B-instruct | 71.62 | 75.77 | 66.06 | 87.48 | 68.92 | 75.71 | 65.20 |
| xiaobu-embedding-v2 | 72.36 | 76.53 | 65.17 | 91.87 | 72.58 | 76.50 | 64.18 |
| Conan-embedding-v1 | 72.50 | 76.77 | 66.33 | 91.66 | 72.76 | 76.67 | 63.67 |
| Conan-embedding-v2 | 74.24 | 76.47 | 68.84 | 92.44 | 74.41 | 78.31 | 65.48 |
| Doubao-1.5-Embedding | 74.45 | 79.22 | 69.57 | 88.49 | 67.98 | 79.64 | 66.67 |

BRIGHT

| Model | BRIGHT AVG |
|---|---|
| SFR-Embedding-Mistral | 18.02 |
| text-embedding-004 | 19.52 |
| GritLM-7B | 20.63 |
| gte-Qwen1.5-7B-instruct | 22.09 |
| gte-Qwen2-7B-instruct | 22.89 |
| Doubao-1.5-Embedding | 24.71 |