SentenceTransformer based on google-bert/bert-base-uncased

This is a sentence-transformers model finetuned from google-bert/bert-base-uncased. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: google-bert/bert-base-uncased
  • Maximum Sequence Length: 75 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 75, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
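
The same two-module stack (a BERT encoder truncating inputs at 75 tokens, followed by CLS-token pooling) can be rebuilt from sentence-transformers building blocks. A minimal sketch, assuming you want to recreate the architecture from the base model rather than load this finetuned checkpoint:

from sentence_transformers import SentenceTransformer, models

# Transformer module: BERT encoder, inputs truncated to 75 tokens
word_embedding_model = models.Transformer("google-bert/bert-base-uncased", max_seq_length=75)

# Pooling module: use the [CLS] token embedding as the 768-dimensional sentence vector
pooling_model = models.Pooling(
    word_embedding_model.get_word_embedding_dimension(),
    pooling_mode="cls",
)

model = SentenceTransformer(modules=[word_embedding_model, pooling_model])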

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("tartspuppy/bert-base-uncased-tsdae-encoder")
# Run inference
sentences = [
    'album five @ -, in an with Billboard magazine, said it was previously "something I wanted to revisit as been doing a while . "The medley a written whereas McCartney had worked the Beatles\' was made of "bits we had knocking . "The off with Vintage "McCartney sat one to looking back [and looking back . about life followed by the bass @ - @ led That Was Me, which is his school days and ",, "from there . songs "Feet the Clouds "about the inactivity while is up of ", about the life being a celebrity The final song medley, The End of ", written McCartney\'s unk> playing on his, Jim\'s piano',
    'The album features a five song @-@ medley , which in an interview with Billboard magazine , McCartney said that it was previously " something I wanted to revisit " as " nobody had been doing that for a while . " The medley was a group of intentionally written material , whereas McCartney had worked on the Beatles \' Abbey Road which , however , was actually made up of " bits we had knocking around . " The medley starts off with " Vintage Clothes " , which McCartney " sat down one day " to write , that was " looking back , [ and ] looking back . " , about life . It was followed by the bass @-@ led " That Was Me " , which is about his " school days and teachers " , the medley , as McCartney stated , then " progressed from there . " The next songs are " Feet in the Clouds " , about the inactivity while one is growing up , and " House of Wax " , about the life of being a celebrity . The final song in medley , " The End of the End " , was written at McCartney \'s <unk> Avenue home while playing on his father , Jim \'s , piano .',
    'Varanasi grew as an important industrial centre , famous for its muslin and silk <unk> , perfumes , ivory works , and sculpture . Buddha is believed to have founded Buddhism here around <unk> BC when he gave his first sermon , " The Setting in Motion of the Wheel of Dharma " , at nearby <unk> . The city \'s religious importance continued to grow in the 8th century , when Adi <unk> established the worship of Shiva as an official sect of Varanasi . Despite the Muslim rule , Varanasi remained the centre of activity for Hindu intellectuals and theologians during the Middle Ages , which further contributed to its reputation as a cultural centre of religion and education . <unk> Tulsidas wrote his epic poem on Lord Rama \'s life called Ram <unk> Manas in Varanasi . Several other major figures of the Bhakti movement were born in Varanasi , including Kabir and Ravidas . Guru Nanak Dev visited Varanasi for <unk> in <unk> , a trip that played a large role in the founding of <unk> . In the 16th century , Varanasi experienced a cultural revival under the Muslim Mughal emperor <unk> who invested in the city , and built two large temples dedicated to Shiva and Vishnu , though much of modern Varanasi was built during the 18th century , by the Maratha and <unk> kings . The kingdom of Benares was given official status by the <unk> in 1737 , and continued as a dynasty @-@ governed area until Indian independence in 1947 . The city is governed by the Varanasi Nagar Nigam ( Municipal Corporation ) and is represented in the Parliament of India by the current Prime Minister of India <unk> <unk> , who won the <unk> <unk> elections in 2014 by a huge margin . Silk weaving , carpets and crafts and tourism employ a significant number of the local population , as do the <unk> <unk> Works and Bharat Heavy <unk> Limited . Varanasi Hospital was established in 1964 .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
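
Beyond pairwise similarity, the embeddings can drive a simple semantic search over a small corpus. A short sketch using the library's util.semantic_search helper; the corpus and query below are illustrative placeholders:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("tartspuppy/bert-base-uncased-tsdae-encoder")

corpus = [
    "Varanasi is famous for its silk weaving and religious history.",
    "The album features a five-song medley discussed with Billboard magazine.",
]
query = "Which city is known for silk weaving?"

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Retrieve the single closest corpus sentence (cosine similarity by default)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=1)
print(hits[0])  # e.g. [{'corpus_id': 0, 'score': ...}]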

Evaluation

Metrics

Semantic Similarity

Metric            sts-dev   sts-test
pearson_cosine    0.6552    0.7355
spearman_cosine   0.6641    0.7320
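
The sts-dev and sts-test columns are Pearson and Spearman correlations between the model's cosine similarities and human similarity judgments on the STS Benchmark dev and test splits. A sketch of how such numbers can be reproduced with EmbeddingSimilarityEvaluator; loading the data from the sentence-transformers/stsb dataset is an assumption, not something recorded in this card:

from datasets import load_dataset
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import EmbeddingSimilarityEvaluator

model = SentenceTransformer("tartspuppy/bert-base-uncased-tsdae-encoder")

# STS Benchmark test split; similarity scores are already normalized to [0, 1]
stsb = load_dataset("sentence-transformers/stsb", split="test")

evaluator = EmbeddingSimilarityEvaluator(
    sentences1=stsb["sentence1"],
    sentences2=stsb["sentence2"],
    scores=stsb["score"],
    name="sts-test",
)
print(evaluator(model))  # includes Pearson/Spearman correlations for cosine similarity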

Training Details

Training Dataset

Unnamed Dataset

  • Size: 21,196 training samples
  • Columns: text
  • Approximate statistics based on the first 1000 samples:
    • Column: text (string)
    • Tokens: min 6, mean 51.01, max 75
  • Samples (text column):
    To promote the album , Carey announced a world tour in April 2003 . As of 2003 , " Charmbracelet World Tour : An Intimate Evening with Mariah Carey " was her most extensive tour , lasting over eight months and performing sixty @-@ nine shows in venues worldwide . Before tickets went on sale in the US , venues were switched from large arenas to smaller , more intimate theater shows . According to Carey , the change was made in order to give fans a more intimate show , and something more Broadway @-@ influenced . She said , " It 's much more intimate so you 'll feel like you had an experience . You experience a night with me . " However , while smaller productions were booked for the US leg of the tour , Carey performed at stadia and arenas in Asia and Europe , and performed for a crowd of over 35 @,@ 000 in Manila , 50 @,@ 000 in Malaysia , and to over 70 @,@ 000 people in China . In the UK , it was Carey 's first tour to feature shows outside London ; she performed in Glasgow , Birming...
    By 1916 , these raiding forces were causing serious concern in the Admiralty as the proximity of Bruges to the British coast , to the troopship lanes across the English Channel and for the U @-@ boats , to the Western Approaches ; the heaviest shipping lanes in the World at the time . In the late spring of 1915 , Admiral Reginald had attempted without success to destroy the lock gates at Ostend with monitors . This effort failed , and Bruges became increasingly important in the Atlantic Campaign , which reached its height in 1917 . By early 1918 , the Admiralty was seeking ever more radical solutions to the problems raised by unrestricted submarine warfare , including instructing the " Allied Naval and Marine Forces " department to plan attacks on U @-@ boat bases in Belgium .
    PWI International Heavyweight Championship ( 1 time )
  • Loss: DenoisingAutoEncoderLoss
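
DenoisingAutoEncoderLoss is the TSDAE objective: the encoder embeds a noised (word-dropped) version of a sentence into a single vector, and a tied decoder must reconstruct the original sentence from that vector. A minimal sketch of the standard TSDAE recipe from the sentence-transformers documentation, which generates the noised inputs on the fly; the exact training script used for this model is not recorded in the card:

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, models, datasets, losses

# Encoder with CLS pooling, matching the architecture shown under "Full Model Architecture"
encoder = models.Transformer("google-bert/bert-base-uncased", max_seq_length=75)
pooling = models.Pooling(encoder.get_word_embedding_dimension(), pooling_mode="cls")
model = SentenceTransformer(modules=[encoder, pooling])

train_sentences = ["An unlabeled example sentence.", "Another unlabeled sentence."]  # plain text only

# Each item becomes a (noised sentence, original sentence) pair via random word deletion
# (DenoisingAutoEncoderDataset relies on nltk for its word-deletion noise)
train_dataset = datasets.DenoisingAutoEncoderDataset(train_sentences)
train_dataloader = DataLoader(train_dataset, batch_size=64, shuffle=True)

# The decoder shares (ties) weights with the encoder; reconstructing the original
# text from the single sentence vector is what trains the embedding (TSDAE)
train_loss = losses.DenoisingAutoEncoderLoss(
    model, decoder_name_or_path="google-bert/bert-base-uncased", tie_encoder_decoder=True
)

model.fit(train_objectives=[(train_dataloader, train_loss)], epochs=1)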

Evaluation Dataset

Unnamed Dataset

  • Size: 2,355 evaluation samples
  • Columns: text
  • Approximate statistics based on the first 1000 samples:
    • Column: text (string)
    • Tokens: min 4, mean 51.08, max 75
  • Samples (text column):
    Wilde 's two final comedies , An Ideal Husband and The Importance of Being Earnest , were still on stage in London at the time of his prosecution , and they were soon closed as the details of his case became public . After two years in prison with hard labour , Wilde went into exile in Paris , sick and depressed , his reputation destroyed in England . In 1898 , when no @-@ one else would , Leonard Smithers agreed with Wilde to publish the two final plays . Wilde proved to be a , sending detailed instructions on stage directions , character listings and the presentation of the book , and insisting that a from the first performance be reproduced inside . Ellmann argues that the proofs show a man " very much in command of himself and of the play " . Wilde 's name did not appear on the cover , it was " By the Author of Lady Windermere 's Fan " . His return to work was brief though , as he refused to write anything else , " I can write , but have lost the joy of writing " ...
    = = = = Ely Viaduct = = = =
    = = World War I = =
  • Loss: DenoisingAutoEncoderLoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 3e-05
  • num_train_epochs: 100
  • warmup_ratio: 0.1
  • fp16: True
  • dataloader_num_workers: 2
  • load_best_model_at_end: True
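
These overrides map directly onto SentenceTransformerTrainingArguments (a subclass of transformers.TrainingArguments). A sketch of how they could be expressed in code; output_dir is an illustrative value, and eval_steps is inferred from the training logs, where a validation loss is reported every 1000 steps:

from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="bert-base-uncased-tsdae-encoder",  # illustrative
    eval_strategy="steps",
    eval_steps=1000,  # inferred from the logged validation losses
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=3e-5,
    num_train_epochs=100,
    warmup_ratio=0.1,
    fp16=True,
    dataloader_num_workers=2,
    load_best_model_at_end=True,
)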

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 3e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 100
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 2
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss sts-dev_spearman_cosine sts-test_spearman_cosine
-1 -1 - - 0.3173 -
0.6024 100 8.2676 - - -
1.2048 200 6.0396 - - -
1.8072 300 4.7794 - - -
2.4096 400 4.2732 - - -
3.0120 500 3.9759 - - -
3.6145 600 3.7263 - - -
4.2169 700 3.5471 - - -
4.8193 800 3.4097 - - -
5.4217 900 3.2513 - - -
6.0241 1000 3.1646 3.3052 0.7232 -
6.6265 1100 3.0129 - - -
7.2289 1200 2.9307 - - -
7.8313 1300 2.8372 - - -
8.4337 1400 2.7232 - - -
9.0361 1500 2.6845 - - -
9.6386 1600 2.546 - - -
10.2410 1700 2.4931 - - -
10.8434 1800 2.4064 - - -
11.4458 1900 2.3145 - - -
12.0482 2000 2.2715 3.1490 0.7177 -
12.6506 2100 2.1495 - - -
13.2530 2200 2.1164 - - -
13.8554 2300 2.0398 - - -
14.4578 2400 1.9538 - - -
15.0602 2500 1.9311 - - -
15.6627 2600 1.8264 - - -
16.2651 2700 1.7786 - - -
16.8675 2800 1.7256 - - -
17.4699 2900 1.6395 - - -
18.0723 3000 1.6082 3.4656 0.6894 -
18.6747 3100 1.5152 - - -
19.2771 3200 1.4678 - - -
19.8795 3300 1.425 - - -
20.4819 3400 1.3395 - - -
21.0843 3500 1.3203 - - -
21.6867 3600 1.2275 - - -
22.2892 3700 1.1955 - - -
22.8916 3800 1.1612 - - -
23.4940 3900 1.0792 - - -
24.0964 4000 1.0557 3.9473 0.6822 -
24.6988 4100 0.9793 - - -
25.3012 4200 0.9516 - - -
25.9036 4300 0.9095 - - -
26.5060 4400 0.8408 - - -
27.1084 4500 0.8338 - - -
27.7108 4600 0.7713 - - -
28.3133 4700 0.8312 - - -
28.9157 4800 0.8437 - - -
29.5181 4900 0.6952 - - -
30.1205 5000 0.6825 4.3702 0.6671 -
30.7229 5100 1.7624 - - -
31.3253 5200 6.9439 - - -
31.9277 5300 6.2218 - - -
32.5301 5400 5.9866 - - -
33.1325 5500 5.8608 - - -
33.7349 5600 5.7661 - - -
34.3373 5700 5.7114 - - -
34.9398 5800 5.6526 - - -
35.5422 5900 5.5982 - - -
36.1446 6000 5.5632 5.6696 0.7876 -
36.7470 6100 5.5455 - - -
37.3494 6200 5.4853 - - -
37.9518 6300 5.4709 - - -
38.5542 6400 5.4372 - - -
39.1566 6500 5.405 - - -
39.7590 6600 5.4011 - - -
40.3614 6700 5.3779 - - -
40.9639 6800 5.3684 - - -
41.5663 6900 5.3462 - - -
42.1687 7000 5.335 5.5090 0.7515 -
42.7711 7100 5.3273 - - -
43.3735 7200 5.3078 - - -
43.9759 7300 5.3005 - - -
44.5783 7400 5.2836 - - -
45.1807 7500 5.2732 - - -
45.7831 7600 5.2707 - - -
46.3855 7700 5.2525 - - -
46.9880 7800 5.2439 - - -
47.5904 7900 5.2316 - - -
48.1928 8000 5.2121 5.4451 0.7316 -
48.7952 8100 5.2142 - - -
49.3976 8200 5.1939 - - -
50.0 8300 5.186 - - -
50.6024 8400 5.166 - - -
51.2048 8500 5.1727 - - -
51.8072 8600 5.1555 - - -
52.4096 8700 5.1538 - - -
53.0120 8800 5.1413 - - -
53.6145 8900 5.1343 - - -
54.2169 9000 5.1257 5.3939 0.7142 -
54.8193 9100 5.1183 - - -
55.4217 9200 5.116 - - -
56.0241 9300 5.0999 - - -
56.6265 9400 5.0922 - - -
57.2289 9500 5.0756 - - -
57.8313 9600 5.0792 - - -
58.4337 9700 5.061 - - -
59.0361 9800 5.0663 - - -
59.6386 9900 5.0493 - - -
60.2410 10000 5.0487 5.3613 0.7019 -
60.8434 10100 5.0462 - - -
61.4458 10200 5.0356 - - -
62.0482 10300 5.0379 - - -
62.6506 10400 5.0243 - - -
63.2530 10500 5.0091 - - -
63.8554 10600 5.0128 - - -
64.4578 10700 5.0099 - - -
65.0602 10800 5.0078 - - -
65.6627 10900 4.9965 - - -
66.2651 11000 4.9907 5.3310 0.6963 -
66.8675 11100 4.9918 - - -
67.4699 11200 4.9724 - - -
68.0723 11300 4.984 - - -
68.6747 11400 4.9689 - - -
69.2771 11500 4.9636 - - -
69.8795 11600 4.9622 - - -
70.4819 11700 4.9547 - - -
71.0843 11800 4.9527 - - -
71.6867 11900 4.9467 - - -
72.2892 12000 4.9397 5.3186 0.6832 -
72.8916 12100 4.9387 - - -
73.4940 12200 4.9299 - - -
74.0964 12300 4.9454 - - -
74.6988 12400 4.9267 - - -
75.3012 12500 4.9258 - - -
75.9036 12600 4.9244 - - -
76.5060 12700 4.9214 - - -
77.1084 12800 4.9125 - - -
77.7108 12900 4.9122 - - -
78.3133 13000 4.9108 5.3026 0.6840 -
78.9157 13100 4.9073 - - -
79.5181 13200 4.8944 - - -
80.1205 13300 4.8987 - - -
80.7229 13400 4.9013 - - -
81.3253 13500 4.8915 - - -
81.9277 13600 4.8883 - - -
82.5301 13700 4.8861 - - -
83.1325 13800 4.882 - - -
83.7349 13900 4.8812 - - -
84.3373 14000 4.8805 5.2968 0.6695 -
84.9398 14100 4.8839 - - -
85.5422 14200 4.8747 - - -
86.1446 14300 4.8652 - - -
86.7470 14400 4.8734 - - -
87.3494 14500 4.872 - - -
87.9518 14600 4.8621 - - -
88.5542 14700 4.8599 - - -
89.1566 14800 4.8649 - - -
89.7590 14900 4.8621 - - -
90.3614 15000 4.8483 5.2860 0.6694 -
90.9639 15100 4.8538 - - -
91.5663 15200 4.86 - - -
92.1687 15300 4.8463 - - -
92.7711 15400 4.8582 - - -
93.3735 15500 4.8444 - - -
93.9759 15600 4.8482 - - -
94.5783 15700 4.848 - - -
95.1807 15800 4.8489 - - -
95.7831 15900 4.8403 - - -
96.3855 16000 4.8425 5.2828 0.6641 -
96.9880 16100 4.8423 - - -
97.5904 16200 4.8377 - - -
98.1928 16300 4.8448 - - -
98.7952 16400 4.8384 - - -
99.3976 16500 4.8381 - - -
100.0 16600 4.8389 - - -
-1 -1 - - - 0.7320
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.12.9
  • Sentence Transformers: 4.0.1
  • Transformers: 4.50.1
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.5.2
  • Datasets: 3.4.1
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

DenoisingAutoEncoderLoss

@inproceedings{wang-2021-TSDAE,
    title = "TSDAE: Using Transformer-based Sequential Denoising Auto-Encoderfor Unsupervised Sentence Embedding Learning",
    author = "Wang, Kexin and Reimers, Nils and Gurevych, Iryna",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
    month = nov,
    year = "2021",
    address = "Punta Cana, Dominican Republic",
    publisher = "Association for Computational Linguistics",
    pages = "671--688",
    url = "https://arxiv.org/abs/2104.06979",
}