Integrate with Sentence Transformers
Hello!
Preface
Congratulations on your #1 spot on MMTEB v2 (multilingual)! I was wondering if you trained with a Matryoshka-aware loss or not, i.e. if I can reduce the dimensionality without hurting the performance too much?
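For context, reducing the dimensionality would mean keeping only a prefix of each embedding, which Sentence Transformers exposes via the `truncate_dim` argument. A quick sketch, where 256 is an arbitrary example dimension:

```python
from sentence_transformers import SentenceTransformer

# Keep only the first 256 embedding dimensions (arbitrary example value);
# this only preserves quality if the model was trained with a Matryoshka-aware loss
model = SentenceTransformer(
    "nvidia/llama-embed-nemotron-8b",
    trust_remote_code=True,
    truncate_dim=256,
)
```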
Pull Request overview
- Integrate with Sentence Transformers
- Add Sentence Transformers snippet to README/tags
Details
This PR adds the configuration required for Sentence Transformers to create the Transformer, Mean Pooling, and Normalization modules. Usage becomes heavily simplified, especially as `model.encode_query` automatically uses the prompt called "query" (from the configs); otherwise, users can call `model.encode(..., prompt="...")` explicitly. It should also help a lot with discoverability!
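For reference, the added configuration roughly corresponds to assembling the model from these modules by hand. This is a sketch with assumed standard module arguments, not the exact contents of the files in this PR:

```python
from sentence_transformers import SentenceTransformer, models

# Approximately what the added configuration makes Sentence Transformers build:
transformer = models.Transformer("nvidia/llama-embed-nemotron-8b")
pooling = models.Pooling(transformer.get_word_embedding_dimension(), pooling_mode="mean")
normalize = models.Normalize()
model = SentenceTransformer(modules=[transformer, pooling, normalize])
```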
You can test it all out before merging like so:
```bash
pip install sentence-transformers
```
```python
from sentence_transformers import SentenceTransformer

# Load the model from this PR's revision
attn_implementation = "eager"  # Or "flash_attention_2"
model = SentenceTransformer(
    "nvidia/llama-embed-nemotron-8b",
    trust_remote_code=True,
    model_kwargs={"attn_implementation": attn_implementation, "torch_dtype": "float16"},
    tokenizer_kwargs={"padding_side": "left"},
    revision="refs/pr/7",
)

queries = [
    "How do neural networks learn patterns from examples?",
]
documents = [
    "Deep learning models adjust their weights through backpropagation, using gradient descent to minimize error on training data and improve predictions over time.",
    "Market prices are determined by the relationship between how much people want to buy a product and how much is available for sale, with scarcity driving prices up and abundance driving them down.",
]

# Embed the queries and documents; encode_query applies the "query" prompt
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)

# Score via dot product (the embeddings are normalized)
scores = query_embeddings @ document_embeddings.T
print(scores.tolist())
```
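Because the Normalization module L2-normalizes the embeddings, the dot product above equals cosine similarity. As a small optional check (assuming the default cosine similarity function is used), the built-in helper should produce the same values:

```python
# Equivalent scoring with the built-in similarity helper (defaults to cosine)
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities.tolist())
```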
I'm able to reproduce the results from your pure `transformers` snippet with this Sentence Transformers snippet.
- Tom Aarsen
Thank you for the PR @tomaarsen, it works nicely! And congrats on Sentence Transformers joining Hugging Face 🤗
> I was wondering if you trained with a Matryoshka-aware loss or not, i.e. if I can reduce the dimensionality without hurting the performance too much?
Unfortunately, we trained it without a Matryoshka-aware loss.
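For context, a Matryoshka-aware loss in Sentence Transformers typically wraps a base loss so that truncated prefixes of each embedding are optimized as well. A minimal sketch, where the base loss and dimension list are illustrative rather than this model's actual training setup:

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.losses import MatryoshkaLoss, MultipleNegativesRankingLoss

model = SentenceTransformer("nvidia/llama-embed-nemotron-8b", trust_remote_code=True)
# Wrap a base loss so the first 64/128/... dimensions also produce good embeddings
loss = MatryoshkaLoss(
    model,
    MultipleNegativesRankingLoss(model),
    matryoshka_dims=[4096, 2048, 1024, 512, 256, 128, 64],
)
```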