SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-large-en-v1.5
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
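
The Pooling module above has pooling_mode_cls_token set to True and every other mode set to False, so the sentence embedding is the vector of the first ([CLS]) token rather than an average over tokens. A minimal sketch of that idea in plain Python follows. The function names and the toy 4-dimensional vectors are illustrative assumptions, not library code; the real model produces 1024-dimensional vectors.

```python
from typing import List

Vector = List[float]


def cls_pool(token_embeddings: List[Vector]) -> Vector:
    """Return the first token's vector, mirroring pooling_mode_cls_token=True."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[Vector]) -> Vector:
    """Average all token vectors; shown only for contrast (disabled in this model)."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[d] for tok in token_embeddings) / n for d in range(dim)]


# Toy transformer output for one sentence: 3 tokens, 4 dimensions each.
tokens = [
    [1.0, 0.0, 2.0, 0.0],  # position 0: the [CLS] token
    [0.0, 3.0, 0.0, 0.0],
    [2.0, 0.0, 1.0, 3.0],
]

cls_embedding = cls_pool(tokens)    # what this model's pooling setting selects
mean_embedding = mean_pool(tokens)  # what mean pooling would give instead
print(cls_embedding)
print(mean_embedding)
```

With CLS pooling, the other token vectors influence the result only through the transformer's attention layers, not through the pooling step itself.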

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
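# Repo ID as shown in this page's model tree; trust_remote_code=True is needed
# because the base model ships custom modeling code (NewModel).
model = SentenceTransformer("spl4shedEdu/gte_ISM", trust_remote_code=True)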
# Run inference
sentences = [
    'b260iunvhp000i 768386242246 20 30 where number of lamps is 1 linear fluorescent ballasts and by wattage x 75w 99w bulbscom universal electronic ballast 120v to 277v for 2 f96t12 universal brand b260iunvhp000i toolsandhomeimprovement',
    'b260iunvhp000i 768386242246 10 50 where length is 10 under 18 bulbscom electronic t12 linear fluorescent ballasts universal electronic ballast 120v to 277v for 2 f96t12 universal brand b260iunvhp000i toolsandhomeimprovement',
    'danze 24 double towel bar danze products at efaucetscom towel bars bathroom accessories danze 24 double towel bar parma collection solid brass construction easy to install mounting hardware included matching faucet collection d446612bn toolsandhomeimprovement',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
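
Because the similarity function is cosine similarity, the 3 x 3 matrix returned by model.similarity holds pairwise cosine scores. The sketch below reproduces that computation in plain Python on toy 2-dimensional vectors. The helper names are hypothetical, and the vectors are made up for illustration rather than produced by the model.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def cosine_similarity_matrix(vectors: Sequence[Sequence[float]]) -> List[List[float]]:
    """Pairwise scores: entry [i][j] compares vectors[i] with vectors[j]."""
    return [[cosine_similarity(u, v) for v in vectors] for u in vectors]


toy_embeddings = [
    [3.0, 4.0],
    [6.0, 8.0],   # same direction as the first vector
    [-4.0, 3.0],  # orthogonal to the first vector
]
scores = cosine_similarity_matrix(toy_embeddings)
for row in scores:
    print([round(s, 3) for s in row])
```

Scores close to 1 indicate near-identical direction, and scores close to 0 indicate unrelated vectors. The diagonal compares each vector with itself.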

Training Details

Training Dataset

Unnamed Dataset

  • Size: 281,362 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (type: string): min 17 tokens, mean 81.45 tokens, max 1180 tokens
    • positive (type: string): min 16 tokens, mean 86.66 tokens, max 1180 tokens
  • Samples:
    • anchor: clever lever extra giga punch scallop circle 35 inches clever wholesale darice this clever lever extra giga punch produces a clearcut scallop circle the craft punch is ideal for embellishing scrapbooks greeting cards invitations programs and many more paper crafts the scalloped circle is 35 inches in size 1 craft punch per package lvxgcp65 officeproducts
      positive: clever lever extra giga punch scallop circle 35 inches clever wholesale darice this clever lever extra giga punch produces a clearcut scallop circle the craft punch is ideal for embellishing scrapbooks greeting cards invitations programs and many more paper crafts the scalloped circle is 35 inches in size 1 craft punch per package lvxgcp65 officeproducts
    • anchor: strut front right shocks springs page 1 2002 bmw 325i base sedan suspension genuine bmw 31312282460boe automotive
      positive: strut front right shocks springs page 1 2002 bmw 325i base sedan suspension note only for cars with sport suspension and m sport package sachs 31312282460m10 automotive
    • anchor: herrold 40 drawer chest in dark walnutmango wood 792977257388 arreton 46quote in washed white oakantique brass sale home lighting fixtures lamps more online symbolizing achievement and rank the shield shape of this six drawer chest bears both historical and design significance built with craftsmens detail from dark walnutstained mango wood and mahogany veneers chest features curved sides smooth uttermost 25738upc792977257388 toolsandhomeimprovement
      positive: herrold 40 drawer chest in dark walnutmango wood 792977257388 malthus 31quote in aged parchmentreclaimed mahogany sale home lighting fixtures lamps more online symbolizing achievement and rank the shield shape of this six drawer chest bears both historical and design significance built with craftsmens detail from dark walnutstained mango wood and mahogany veneers chest features curved sides smooth uttermost 25738upc792977257388 toolsandhomeimprovement
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
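
CachedMultipleNegativesRankingLoss optimizes the same in-batch-negatives objective as MultipleNegativesRankingLoss, with gradient caching added to reduce memory use. The objective itself can be sketched in plain Python: for each anchor, every positive in the batch is scored by scaled cosine similarity, and cross-entropy rewards the anchor's own positive over the others. The function below is an illustrative assumption that omits the caching step entirely; it is not the library implementation.

```python
import math
from typing import Sequence


def cos_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, matching the "similarity_fct": "cos_sim" setting."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def multiple_negatives_ranking_loss(anchors, positives, scale: float = 20.0) -> float:
    """Mean cross-entropy where positives[i] is the correct match for anchors[i]."""
    if not anchors or len(anchors) != len(positives):
        raise ValueError("need equally sized, non-empty anchor and positive batches")
    total = 0.0
    for i, anchor in enumerate(anchors):
        # Every positive in the batch is a candidate; all but index i act as negatives.
        logits = [scale * cos_sim(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_sum_exp - logits[i]
    return total / len(anchors)


# Toy batch of two pairs with 2-dimensional embeddings.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = multiple_negatives_ranking_loss(anchors, [[1.0, 0.0], [0.0, 1.0]])
swapped_loss = multiple_negatives_ranking_loss(anchors, [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss, swapped_loss)
```

The loss is near zero when each anchor is closest to its own positive and grows when another item in the batch scores higher. The scale of 20.0 sharpens the softmax over cosine scores, which lie in [-1, 1].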
    

Evaluation Dataset

Unnamed Dataset

  • Size: 70,341 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (type: string): min 18 tokens, mean 86.53 tokens, max 1175 tokens
    • positive (type: string): min 17 tokens, mean 86.39 tokens, max 1176 tokens
  • Samples:
    • anchor: retro 70s furniture set armchairs chairs and vector image furniture images over 41 000 retro 70s furniture set armchairs chairs and sofas vector illustration eps 8 vector image 14149273 officeproducts
      positive: retro 70s furniture set armchairs chairs and vector image setting images over 12 million retro 70s furniture set armchairs chairs and sofas vector illustration eps 8 vector image 14149273 officeproducts
    • anchor: hp designjet 70 cartridges for ink jet printers quillcom ink volume 130 mlthis cartridge is not compatible with hp designjet t620 24in photo printer hp photosmart pro b9180 printer hp photosmart pro b8850 photo printer hp photosmart pro b8800 photo printerfaderesistant color provides superior results and brilliant truetolife images that last for generations 901680441 officeproducts
      positive: hp designjet z2100 44 in cartridges for ink jet printers quillcom ink volume 130 mlthis cartridge is not compatible with hp designjet t620 24in photo printer hp photosmart pro b9180 printer hp photosmart pro b8850 photo printer hp photosmart pro b8800 photo printerfaderesistant color provides superior results and brilliant truetolife images that last for generations 901680441 officeproducts
    • anchor: suspension strut assembly shocks springs page 1 1996 bmw 318i base convertible suspension note front left w sport suspension front left bilstein touring class 22172518int automotive
      positive: suspension strut assembly shocks springs page 1 1997 bmw 318is base coupe suspension note front left w sport suspension front left bilstein touring class 22172518int automotive
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 1e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • fp16: True
  • auto_find_batch_size: True
  • batch_sampler: no_duplicates
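
With warmup_ratio 0.1 and the linear scheduler listed under All Hyperparameters, the learning rate rises linearly from 0 to 1e-05 over the first 10% of optimizer steps and then decays linearly back to 0. The sketch below shows that shape. The function name is hypothetical, and the total of 70,342 steps is an assumption derived from 281,362 samples, a batch size of 8, and 2 epochs.

```python
import math


def linear_schedule_lr(step: int, total_steps: int,
                       peak_lr: float = 1e-05, warmup_ratio: float = 0.1) -> float:
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed total: ceil(281362 / 8) steps per epoch, times 2 epochs.
assumed_total_steps = 2 * math.ceil(281_362 / 8)

for step in (0, 7_000, 35_000, 63_000, assumed_total_steps):
    print(step, linear_schedule_lr(step, assumed_total_steps))
```

Under these assumptions the first logged evaluation at step 7000 falls just before the end of warmup, so the largest training-loss drop coincides with the period of rising learning rate.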

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: True
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
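
The no_duplicates batch sampler matters for a loss built on in-batch negatives: if the same text appeared twice in one batch, a genuine match would be scored as a negative. The idea can be sketched as a greedy pass that defers any pair whose texts already occur in the current batch. This is a simplified illustration with hypothetical names and toy data, not the sampler the library actually uses.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[str, str]  # (anchor, positive)


def no_duplicate_batches(pairs: Sequence[Pair], batch_size: int) -> List[List[int]]:
    """Group pair indices into batches so that no text repeats within a batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    remaining = list(range(len(pairs)))
    batches: List[List[int]] = []
    while remaining:
        batch: List[int] = []
        seen = set()
        deferred: List[int] = []
        for idx in remaining:
            texts = set(pairs[idx])
            if len(batch) < batch_size and not (texts & seen):
                batch.append(idx)
                seen |= texts
            else:
                # Batch is full or a text would repeat: try again in a later batch.
                deferred.append(idx)
        batches.append(batch)
        remaining = deferred
    return batches


toy_pairs = [("a", "x"), ("a", "y"), ("b", "z"), ("c", "x")]
batches = no_duplicate_batches(toy_pairs, batch_size=2)
print(batches)
```

In the toy data, pairs 0 and 1 share the anchor "a" and pairs 0 and 3 share the positive "x", so neither combination ends up in the same batch.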

Training Logs

Epoch   Step   Training Loss  Validation Loss
0.1990  7000   0.0076         0.0027
0.3981  14000  0.0022         0.0019
0.5971  21000  0.0016         0.0013
0.7961  28000  0.0013         0.0011
0.9951  35000  0.0012         0.0008
1.1942  42000  0.0007         0.0007
1.3932  49000  0.0004         0.0009
1.5922  56000  0.0004         0.0007
1.7912  63000  0.0003         0.0006
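
The Epoch column can be cross-checked against the Step column. Assuming an effective batch size of 8 (per_device_train_batch_size on a single device; auto_find_batch_size could in principle have lowered it, so this is an assumption), 281,362 training samples give 35,171 steps per epoch, and dividing each logged step by that number reproduces the logged epoch values.

```python
import math

train_samples = 281_362
batch_size = 8  # assumption: one device, batch size not reduced by auto_find_batch_size
steps_per_epoch = math.ceil(train_samples / batch_size)  # the last partial batch is kept

# Step -> epoch value as printed in the training log.
logged_epochs = {
    7000: "0.1990",
    14000: "0.3981",
    21000: "0.5971",
    28000: "0.7961",
    35000: "0.9951",
    42000: "1.1942",
    49000: "1.3932",
    56000: "1.5922",
    63000: "1.7912",
}

computed_epochs = {step: f"{step / steps_per_epoch:.4f}" for step in logged_epochs}
for step, epoch in computed_epochs.items():
    print(step, epoch, "matches log" if epoch == logged_epochs[step] else "differs from log")
```

The agreement suggests the run used a batch size of 8 throughout, for roughly 70,342 steps over the 2 epochs.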

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.1
  • PyTorch: 2.2.1
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup}, 
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Model size: 435M params (Safetensors, tensor type F32)

Model tree for spl4shedEdu/gte_ISM

  • Base model: Alibaba-NLP/gte-large-en-v1.5
  • Finetuned: this model