SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-large-en-v1.5
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
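
For a quick sanity check, here is a minimal sketch that reads these settings back from the loaded model. It assumes the repository id spl4shedEdu/gte_ESB referenced by this card, and that the custom gte modeling code must be trusted:

from sentence_transformers import SentenceTransformer

# trust_remote_code is assumed to be needed because the gte-large-en-v1.5 backbone
# registers a custom "NewModel" architecture.
model = SentenceTransformer("spl4shedEdu/gte_ESB", trust_remote_code=True)

print(model.max_seq_length)                      # 8192
print(model.get_sentence_embedding_dimension())  # 1024
print(model[1].pooling_mode_cls_token)           # True: CLS-token pooling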

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
# trust_remote_code is assumed to be required because the gte backbone uses custom modeling code ("NewModel")
model = SentenceTransformer("spl4shedEdu/gte_ESB", trust_remote_code=True)
# Run inference
sentences = [
    'alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite radar flight gloves 35618185392x comfortable glove lightweight customized fit silicone grip patterning on fingers for improved riding control included items 2 gloves made with 46 synthetic suede 35 polyester 19 polyamide care instructions do not wash bleach tumble dry iron clean single layer clarino palm is breathable and offers excellent feel the bikes controls reinforced thumb construction increases durability gusset flexibility innovative stretch insert in area hand movement lever reinforcements third fourth added abrasion resistance convenient slipon design a secure singlepiece fabric upper gives perforated ergonomic chassis reduced material result supremely lightweight alpinestars automotive',
    'alpinestars 140 holdall gear bag alpinestars fl yellowredanthracite radar flight gloves 35618185392x comfortable glove lightweight customized fit silicone grip patterning on fingers for improved riding control included items 2 gloves made with 46 synthetic suede 35 polyester 19 polyamide care instructions do not wash bleach tumble dry iron clean single layer clarino palm is breathable and offers excellent feel the bikes controls reinforced thumb construction increases durability gusset flexibility innovative stretch insert in area hand movement lever reinforcements third fourth added abrasion resistance convenient slipon design a secure singlepiece fabric upper gives perforated ergonomic chassis reduced material result supremely lightweight alpinestars automotive',
    'td 8000k xenon hid kit high beam 0910 mercedes benz cl600 c216 h7 xenon hid lighting is only available on high end luxury cars you can convert your stock halogens to super bright too by just connecting a few plug and play connections then mounting the ballast in secure spot but with this mercedes cl600 low watt 8000k td hid high beam conversion kit experience supreme brightness expanded field of vision also our wattage systems are backed by full one year warrantyplease note will not work if cl600s headlights came equipped factory lights unlike cheaper market more consistently without fading out like coated bulbs dousually mercedes installations probably most common upgrades performed increase headlight cl600 producing certain temperatures technology that uses xenon gas charged bulb combination an electronic regulate current going through it the resulting light 35 be up 3 times brighter than traditional halogen bulbs kits reliably produce truer colored we offer conversion kit short for intensity discharge automotive',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
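
Beyond pairwise similarity, the same embeddings can drive semantic search. Below is a minimal sketch using the utility helpers shipped with Sentence Transformers; the corpus and query strings are illustrative placeholders, not part of the training data:

from sentence_transformers import SentenceTransformer, util

# trust_remote_code is assumed to be required for the custom gte backbone
model = SentenceTransformer("spl4shedEdu/gte_ESB", trust_remote_code=True)

corpus = [
    "alpinestars radar flight gloves, lightweight with silicone grip patterning",
    "8000k xenon hid high beam conversion kit for mercedes benz cl600",
    "tripp lite 25u 4-post open frame rack cabinet, 1000 lb capacity",
]
queries = ["hid headlight conversion kit"]

corpus_embeddings = model.encode(corpus)
query_embeddings = model.encode(queries)

# Returns, per query, the top_k corpus entries ranked by cosine similarity
hits = util.semantic_search(query_embeddings, corpus_embeddings, top_k=2)
for hit in hits[0]:
    print(corpus[hit["corpus_id"]], round(hit["score"], 3))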

Training Details

Training Dataset

Unnamed Dataset

  • Size: 269,761 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min: 13 tokens, mean: 68.94 tokens, max: 1130 tokens
    • positive (string): min: 12 tokens, mean: 70.35 tokens, max: 1149 tokens
  • Samples:
    • anchor: tripp lite 25u 4post open frame rack cabinet square holes 1000lb capacity open frame rack tripp 25u prices cnet tripp lite otherelectronics
      positive: tripp lite 25u 4post open frame rack cabinet square holes 1000lb capacity open frame rack tripp 25u specs cnet null tripp lite otherelectronics
    • anchor: headlamp restoration kit philips 2000 bmw 323ci base coupe lights and lenses page 6 note removes yellowing and haze of plastic headlight lenses restoring likenew condition and finish professional results in under 30 minutes can be used on headlights taillights turn signals and reflective lens covers with uv coating technology one kit restores two headlights contains qty 1 pretreatment 1 cleanerpolish 1 shine restorerpreserver 3 sandpaper 600 1500 2000 grit 10 applicator polish cloths 1 pair of vinyl gloves philips automotive
      positive: headlamp restoration kit philips 1996 bmw 318i base convertible lights and lenses page 6 note removes yellowing and haze of plastic headlight lenses restoring likenew condition and finish professional results in under 30 minutes can be used on headlights taillights turn signals and reflective lens covers with uv coating technology one kit restores two headlights contains qty 1 pretreatment 1 cleanerpolish 1 shine restorerpreserver 3 sandpaper 600 1500 2000 grit 10 applicator polish cloths 1 pair of vinyl gloves philips automotive
    • anchor: hose clamp 132146 mm range 12 width spring type 1991 bmw 325i base coupe cooling system miscellaneous page 1 mubea automotive
      positive: hose clamp 132146 mm range 12 width spring type 1994 bmw 325i base convertible cooling system miscellaneous page 1 mubea automotive
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
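
    As a rough, hedged sketch of how this loss maps to code in Sentence Transformers (the mini_batch_size value below is an assumption, not a recorded training setting):

    from sentence_transformers import SentenceTransformer
    from sentence_transformers.losses import CachedMultipleNegativesRankingLoss
    from sentence_transformers.util import cos_sim

    model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

    # scale and similarity_fct mirror the parameters listed above;
    # mini_batch_size (the gradient-caching chunk size) is an assumption.
    loss = CachedMultipleNegativesRankingLoss(
        model, scale=20.0, similarity_fct=cos_sim, mini_batch_size=32
    )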
    

Evaluation Dataset

Unnamed Dataset

  • Size: 67,441 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor (string): min: 11 tokens, mean: 74.02 tokens, max: 693 tokens
    • positive (string): min: 15 tokens, mean: 74.68 tokens, max: 812 tokens
  • Samples:
    • anchor: bulb dashboard instruments with black socket base 12v 12w 1995 bmw 318ti hatchback lights and lenses page 3 genuine bmw automotive
      positive: bulb dashboard instruments with black socket base 12v 12w 1999 bmw 323is coupe gauges miscellaneous page 1 osramsylvania automotive
    • anchor: canon pixma mp282 high capacity black compatible ink cartridge ink volumeremanufactured pg512 black 18ml 1 cartridge 18ml officeproducts
      positive: canon pixma mp282 high capacity black compatible ink cartridge cartridges inkrediblecouk 1 black ink cartridge 18ml officeproducts
    • anchor: oring for camshaft position sensor 17 x 3 mm 2001 bmw 325i base wagon camshafts timing chains page 1 note 17 x 3mm uro automotive
      positive: oring for crankshaft sensor 17 x 3 mm 2000 bmw 323ci base coupe sensors page 5 note 17 x 3mm uro automotive
  • Loss: CachedMultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • learning_rate: 1e-05
  • num_train_epochs: 2
  • warmup_ratio: 0.1
  • fp16: True
  • auto_find_batch_size: True
  • batch_sampler: no_duplicates
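
A hedged sketch of how these non-default values would be expressed with SentenceTransformerTrainingArguments (output_dir is a placeholder; everything else mirrors the list above):

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="output",  # placeholder, not a recorded setting
    eval_strategy="steps",
    learning_rate=1e-5,
    num_train_epochs=2,
    warmup_ratio=0.1,
    fp16=True,
    auto_find_batch_size=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
)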

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 8
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: True
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • eval_use_gather_object: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch  | Step  | Training Loss | Validation Loss
0.2076 | 7000  | 0.0120        | 0.0057
0.4152 | 14000 | 0.0044        | 0.0040
0.6228 | 21000 | 0.0038        | 0.0040
0.8303 | 28000 | 0.0033        | 0.0028
1.0379 | 35000 | 0.0020        | 0.0025
1.2455 | 42000 | 0.0012        | 0.0022
1.4531 | 49000 | 0.0008        | 0.0021
1.6607 | 56000 | 0.0005        | 0.0021
1.8683 | 63000 | 0.0004        | 0.0020

Framework Versions

  • Python: 3.10.13
  • Sentence Transformers: 3.0.1
  • Transformers: 4.44.0
  • PyTorch: 2.2.1
  • Accelerate: 0.33.0
  • Datasets: 2.21.0
  • Tokenizers: 0.19.1
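
To approximate this environment, the listed versions can be pinned at install time (a convenience suggestion, not part of the original card):

pip install sentence-transformers==3.0.1 transformers==4.44.0 torch==2.2.1 accelerate==0.33.0 datasets==2.21.0 tokenizers==0.19.1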

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CachedMultipleNegativesRankingLoss

@misc{gao2021scaling,
    title={Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup}, 
    author={Luyu Gao and Yunyi Zhang and Jiawei Han and Jamie Callan},
    year={2021},
    eprint={2101.06983},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}