SGPT-125M-weightedmean-nli-bitfit
Usage
For usage instructions, refer to our codebase: https://github.com/Muennighoff/sgpt
Evaluation Results
For eval results, refer to the eval folder or our paper: https://arxiv.org/abs/2202.08904
Training
The model was trained with the parameters:
DataLoader:
sentence_transformers.datasets.NoDuplicatesDataLoader.NoDuplicatesDataLoader
of length 8807 with parameters:
{'batch_size': 64}
Loss:
sentence_transformers.losses.MultipleNegativesRankingLoss.MultipleNegativesRankingLoss
with parameters:
{'scale': 20.0, 'similarity_fct': 'cos_sim'}
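As a rough illustration of what these loss settings mean, the sketch below computes an in-batch-negatives ranking loss in plain Python: each anchor is scored against every positive in the batch with cosine similarity, the scores are multiplied by the scale (20.0), and cross-entropy is taken with the matching positive as the target. The function names and toy vectors are hypothetical and only for illustration; this is not the sentence_transformers implementation.

```python
import math

def cos_sim(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy where positives[i] is the target for anchors[i]
    and every other positive in the batch acts as a negative."""
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp over the scaled similarities.
        max_logit = max(logits)
        log_sum_exp = max_logit + math.log(
            sum(math.exp(x - max_logit) for x in logits)
        )
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)

# Toy batch (hypothetical data): aligned pairs give a loss near zero,
# swapped pairs give a large loss.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
```

With a batch size of 64, each anchor would see 63 in-batch negatives under this scheme, which is presumably why a loader that avoids duplicate texts within a batch is used.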
Parameters of the fit()-Method:
{
    "epochs": 1,
    "evaluation_steps": 880,
    "evaluator": "sentence_transformers.evaluation.EmbeddingSimilarityEvaluator.EmbeddingSimilarityEvaluator",
    "max_grad_norm": 1,
    "optimizer_class": "<class 'transformers.optimization.AdamW'>",
    "optimizer_params": {
        "lr": 0.0002
    },
    "scheduler": "WarmupLinear",
    "steps_per_epoch": null,
    "warmup_steps": 881,
    "weight_decay": 0.01
}
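To make the scheduler settings concrete, here is a minimal sketch of a linear warmup followed by linear decay, using the values listed above (lr 0.0002, 881 warmup steps). It assumes the total number of optimizer steps equals the DataLoader length times the epoch count (8807 × 1), which is not stated explicitly in the card; the function name is hypothetical.

```python
def warmup_linear_lr(step, base_lr=2e-4, warmup_steps=881, total_steps=8807):
    """Learning rate at a given step: linear ramp from 0 to base_lr during
    warmup, then linear decay back to 0 at total_steps (assumed value)."""
    if step < warmup_steps:
        factor = step / max(1, warmup_steps)
    else:
        factor = max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))
    return base_lr * factor

# Learning rate at the start, at the end of warmup, and at the last step.
lr_start = warmup_linear_lr(0)
lr_peak = warmup_linear_lr(881)
lr_end = warmup_linear_lr(8807)
```

Under this assumption the warmup covers roughly the first 10% of training, after which the rate decays linearly to zero.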
Full Model Architecture
SentenceTransformer(
  (0): Transformer({'max_seq_length': 75, 'do_lower_case': False}) with Transformer model: GPTNeoModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': True, 'pooling_mode_lasttoken': False})
)
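The only pooling mode enabled above is pooling_mode_weightedmean_tokens. A minimal sketch of position-weighted mean pooling follows, assuming token i (1-indexed) receives weight i, so later tokens count more, and padded positions are excluded through the attention mask. The function name and toy inputs are hypothetical, and the sketch assumes at least one unmasked token.

```python
def weighted_mean_pooling(token_embeddings, attention_mask):
    """Position-weighted mean of token vectors.

    token_embeddings: list of per-token vectors (lists of floats).
    attention_mask: list of 0/1 flags, 0 marking padding.
    """
    dim = len(token_embeddings[0])
    pooled = [0.0] * dim
    weight_total = 0.0
    for position, (vector, mask) in enumerate(
        zip(token_embeddings, attention_mask), start=1
    ):
        weight = position * mask  # padding gets weight 0
        weight_total += weight
        for d in range(dim):
            pooled[d] += weight * vector[d]
    # Normalise so the weights sum to 1 over the unmasked tokens.
    return [value / weight_total for value in pooled]

# Toy example (hypothetical 2-dimensional embeddings, last token is padding).
sentence_embedding = weighted_mean_pooling(
    [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]],
    [1, 1, 0],
)
```

For a causal decoder such as GPTNeoModel, weighting later positions more heavily is a natural choice, since only later tokens have attended to the earlier context.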
Citing & Authors
@article{muennighoff2022sgpt,
title={SGPT: GPT Sentence Embeddings for Semantic Search},
author={Muennighoff, Niklas},
journal={arXiv preprint arXiv:2202.08904},
year={2022}
}
Evaluation results (self-reported, test sets)
- MTEB AmazonCounterfactualClassification (en): accuracy 65.881, ap 28.685, f1 59.800
- MTEB AmazonCounterfactualClassification (de): accuracy 59.079, ap 73.919, f1 56.632
- MTEB AmazonCounterfactualClassification (en-ext): accuracy 64.918, ap 16.361, f1 53.127
- MTEB AmazonCounterfactualClassification (ja): accuracy 56.424