SentenceTransformer

This is a sentence-transformers model trained on 12,683 labeled Japanese sentences (see Training Details below). It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 111M parameters (F32)

Model Sources

  • Documentation: Sentence Transformers Documentation (https://www.sbert.net)
  • Repository: Sentence Transformers on GitHub (https://github.com/UKPLab/sentence-transformers)
  • Hugging Face: Sentence Transformers on Hugging Face (https://huggingface.co/models?library=sentence-transformers)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
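
The Pooling module is configured with pooling_mode_cls_token: True, so the sentence embedding is simply the 768-dimensional final hidden state of the [CLS] token produced by the underlying BertModel. The sketch below reproduces that pooling with the plain transformers API purely for illustration; the model ID is the one published above, and the SentenceTransformer usage in the next section remains the recommended path.

import torch
from transformers import AutoTokenizer, AutoModel

# Illustration of CLS-token pooling; prefer the SentenceTransformer API below.
model_id = "Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_6"
tokenizer = AutoTokenizer.from_pretrained(model_id)
encoder = AutoModel.from_pretrained(model_id)

inputs = tokenizer(
    ["科目:ユニット及びその他。名称:テラス床再生木デッキ。"],
    padding=True, truncation=True, max_length=512, return_tensors="pt",
)
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # (batch, seq_len, 768)
embedding = hidden[:, 0]                          # CLS pooling -> (batch, 768)
print(embedding.shape)                            # torch.Size([1, 768])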

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("Detomo/cl-nagoya-sup-simcse-ja-nss-v_1_0_6")
# Run inference
sentences = [
    '科目:ユニット及びその他。名称:テラス床再生木デッキ。',
    '科目:ユニット及びその他。名称:駐車ゾーンサイン。',
    '科目:ユニット及びその他。名称:#階 MWC、WWC他姿見鏡。',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
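
The similarity matrix can also be used for lightweight semantic search over the encoded sentences. The sketch below ranks the corpus against a new query and continues from the snippet above; the query string itself is a made-up example.

# Continues from the snippet above; the query text is a made-up example.
query = "科目:ユニット及びその他。名称:テラス床木デッキ。"
query_embedding = model.encode([query])

# Cosine similarity between the query and each corpus sentence, shape [1, 3]
scores = model.similarity(query_embedding, embeddings)
best = scores.argmax().item()
print(sentences[best], scores[0, best].item())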

Training Details

Training Dataset

Unnamed Dataset

  • Size: 12,683 training samples
  • Columns: sentence and label
  • Approximate statistics based on the first 1000 samples:
    • sentence: type string; min: 11 tokens; mean: 18.16 tokens; max: 54 tokens
    • label: type int; per-class share (classes 0 to 204) listed below:
    • 0: ~0.30%
    • 1: ~0.30%
    • 2: ~0.30%
    • 3: ~0.30%
    • 4: ~0.30%
    • 5: ~0.30%
    • 6: ~0.30%
    • 7: ~0.30%
    • 8: ~0.30%
    • 9: ~0.30%
    • 10: ~0.30%
    • 11: ~0.30%
    • 12: ~1.10%
    • 13: ~0.30%
    • 14: ~0.30%
    • 15: ~0.30%
    • 16: ~0.30%
    • 17: ~0.30%
    • 18: ~0.30%
    • 19: ~0.30%
    • 20: ~0.30%
    • 21: ~0.30%
    • 22: ~0.30%
    • 23: ~0.40%
    • 24: ~0.30%
    • 25: ~0.30%
    • 26: ~0.30%
    • 27: ~0.90%
    • 28: ~0.30%
    • 29: ~0.40%
    • 30: ~0.30%
    • 31: ~1.10%
    • 32: ~0.30%
    • 33: ~0.30%
    • 34: ~0.30%
    • 35: ~0.30%
    • 36: ~0.30%
    • 37: ~0.30%
    • 38: ~0.30%
    • 39: ~0.30%
    • 40: ~0.30%
    • 41: ~0.30%
    • 42: ~0.30%
    • 43: ~0.30%
    • 44: ~0.30%
    • 45: ~0.30%
    • 46: ~0.30%
    • 47: ~0.30%
    • 48: ~0.30%
    • 49: ~0.40%
    • 50: ~0.30%
    • 51: ~0.30%
    • 52: ~0.30%
    • 53: ~0.60%
    • 54: ~0.30%
    • 55: ~0.30%
    • 56: ~0.30%
    • 57: ~0.30%
    • 58: ~0.30%
    • 59: ~0.30%
    • 60: ~0.30%
    • 61: ~0.30%
    • 62: ~0.30%
    • 63: ~0.30%
    • 64: ~0.30%
    • 65: ~0.30%
    • 66: ~0.30%
    • 67: ~0.30%
    • 68: ~0.30%
    • 69: ~0.30%
    • 70: ~0.30%
    • 71: ~0.30%
    • 72: ~0.50%
    • 73: ~0.30%
    • 74: ~0.30%
    • 75: ~0.30%
    • 76: ~0.30%
    • 77: ~0.30%
    • 78: ~0.30%
    • 79: ~0.30%
    • 80: ~0.30%
    • 81: ~0.30%
    • 82: ~0.30%
    • 83: ~0.30%
    • 84: ~0.30%
    • 85: ~0.30%
    • 86: ~0.30%
    • 87: ~0.30%
    • 88: ~0.80%
    • 89: ~0.30%
    • 90: ~0.30%
    • 91: ~0.30%
    • 92: ~0.30%
    • 93: ~0.30%
    • 94: ~0.30%
    • 95: ~0.30%
    • 96: ~0.30%
    • 97: ~0.50%
    • 98: ~0.30%
    • 99: ~0.30%
    • 100: ~0.30%
    • 101: ~0.30%
    • 102: ~0.80%
    • 103: ~0.60%
    • 104: ~0.50%
    • 105: ~0.30%
    • 106: ~0.30%
    • 107: ~16.50%
    • 108: ~0.30%
    • 109: ~0.30%
    • 110: ~0.30%
    • 111: ~0.30%
    • 112: ~0.30%
    • 113: ~0.30%
    • 114: ~0.30%
    • 115: ~0.30%
    • 116: ~0.50%
    • 117: ~0.30%
    • 118: ~0.30%
    • 119: ~0.30%
    • 120: ~0.30%
    • 121: ~0.30%
    • 122: ~0.30%
    • 123: ~0.30%
    • 124: ~0.30%
    • 125: ~0.70%
    • 126: ~0.30%
    • 127: ~0.30%
    • 128: ~0.30%
    • 129: ~0.40%
    • 130: ~2.10%
    • 131: ~2.10%
    • 132: ~0.30%
    • 133: ~0.30%
    • 134: ~0.50%
    • 135: ~0.50%
    • 136: ~0.50%
    • 137: ~0.40%
    • 138: ~0.30%
    • 139: ~0.30%
    • 140: ~0.30%
    • 141: ~0.30%
    • 142: ~0.30%
    • 143: ~0.30%
    • 144: ~0.30%
    • 145: ~0.30%
    • 146: ~0.30%
    • 147: ~0.30%
    • 148: ~0.30%
    • 149: ~0.30%
    • 150: ~0.30%
    • 151: ~0.30%
    • 152: ~0.30%
    • 153: ~0.30%
    • 154: ~0.50%
    • 155: ~0.30%
    • 156: ~0.40%
    • 157: ~0.30%
    • 158: ~0.30%
    • 159: ~0.30%
    • 160: ~0.30%
    • 161: ~0.30%
    • 162: ~0.30%
    • 163: ~0.30%
    • 164: ~0.30%
    • 165: ~0.30%
    • 166: ~0.30%
    • 167: ~0.30%
    • 168: ~0.30%
    • 169: ~0.40%
    • 170: ~0.30%
    • 171: ~0.30%
    • 172: ~0.30%
    • 173: ~0.30%
    • 174: ~0.30%
    • 175: ~0.30%
    • 176: ~0.70%
    • 177: ~0.30%
    • 178: ~0.30%
    • 179: ~0.30%
    • 180: ~0.30%
    • 181: ~1.30%
    • 182: ~0.30%
    • 183: ~0.40%
    • 184: ~0.30%
    • 185: ~0.30%
    • 186: ~0.30%
    • 187: ~1.50%
    • 188: ~0.30%
    • 189: ~0.30%
    • 190: ~0.30%
    • 191: ~0.30%
    • 192: ~0.30%
    • 193: ~0.30%
    • 194: ~0.30%
    • 195: ~1.60%
    • 196: ~0.30%
    • 197: ~0.30%
    • 198: ~7.20%
    • 199: ~0.30%
    • 200: ~1.00%
    • 201: ~0.30%
    • 202: ~0.30%
    • 203: ~0.30%
    • 204: ~0.90%
  • Samples:
    • sentence: 科目:コンクリート。名称:免震基礎天端グラウト注入。 | label: 0
    • sentence: 科目:コンクリート。名称:免震基礎天端グラウト注入。 | label: 0
    • sentence: 科目:コンクリート。名称:免震基礎天端グラウト注入。 | label: 0
  • Loss: sentence_transformer_lib.custom_batch_all_trip_loss.CustomBatchAllTripletLoss
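
CustomBatchAllTripletLoss is not part of the public Sentence Transformers library; judging by its name and the citation below, it follows the batch-all triplet strategy of Hermans et al. (2017), which forms every valid (anchor, positive, negative) triplet inside a batch and averages the hinge losses that are still active. The PyTorch sketch below implements only the standard batch-all formulation; the margin value and the Euclidean distance are assumptions, and the custom variant used for this model may differ.

import torch


def batch_all_triplet_loss(embeddings: torch.Tensor, labels: torch.Tensor,
                           margin: float = 5.0) -> torch.Tensor:
    """Standard batch-all triplet loss; margin and distance are assumptions."""
    # Pairwise Euclidean distances, shape (batch, batch)
    dist = torch.cdist(embeddings, embeddings, p=2)

    # Anchor/positive pairs share a label (excluding self-pairs); negatives do not
    same = labels.unsqueeze(0) == labels.unsqueeze(1)
    eye = torch.eye(len(labels), dtype=torch.bool, device=labels.device)
    pos_mask = same & ~eye
    neg_mask = ~same

    # loss[a, p, n] = d(a, p) - d(a, n) + margin for every triplet in the batch
    loss = dist.unsqueeze(2) - dist.unsqueeze(1) + margin
    valid = pos_mask.unsqueeze(2) & neg_mask.unsqueeze(1)
    loss = torch.relu(loss) * valid

    # "Batch-all" reduction: average over triplets still violating the margin
    num_active = (loss > 1e-16).sum().clamp(min=1)
    return loss.sum() / num_active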

Training Hyperparameters

Non-Default Hyperparameters

  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • num_train_epochs: 250
  • warmup_ratio: 0.2
  • fp16: True
  • batch_sampler: group_by_label
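
A minimal sketch of how these hyperparameters could be wired together with the SentenceTransformerTrainer API is shown below. The base checkpoint, output directory, and one-row dataset are placeholders, and the built-in BatchAllTripletLoss stands in for the custom loss, so treat this as an outline rather than the exact training script.

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import BatchAllTripletLoss
from sentence_transformers.training_args import BatchSamplers

# Assumed base checkpoint (inferred from the model name); the real 12,683-sample
# dataset with "sentence" and "label" columns replaces this one-row placeholder.
model = SentenceTransformer("cl-nagoya/sup-simcse-ja-base")
train_dataset = Dataset.from_dict({
    "sentence": ["科目:コンクリート。名称:免震基礎天端グラウト注入。"],
    "label": [0],
})

args = SentenceTransformerTrainingArguments(
    output_dir="output",               # placeholder
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=1e-5,
    weight_decay=0.01,
    num_train_epochs=250,
    warmup_ratio=0.2,
    fp16=True,
    batch_sampler=BatchSamplers.GROUP_BY_LABEL,
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=BatchAllTripletLoss(model),   # stand-in for CustomBatchAllTripletLoss
)
trainer.train()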

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: no
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 1e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 250
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.2
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: group_by_label
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
2.16 50 0.0584
4.32 100 0.0591
6.48 150 0.0675
8.64 200 0.0637
10.8 250 0.0637
13.04 300 0.0647
15.2 350 0.0656
17.36 400 0.0578
19.52 450 0.0585
21.68 500 0.0546
23.84 550 0.0523
26.08 600 0.0563
28.24 650 0.0526
30.4 700 0.0532
32.56 750 0.0546
34.72 800 0.0483
36.88 850 0.0566
39.12 900 0.0482
41.28 950 0.0508
43.44 1000 0.05
45.6 1050 0.0471
47.76 1100 0.0502
49.92 1150 0.0477
52.16 1200 0.0429
54.32 1250 0.0415
56.48 1300 0.0433
58.64 1350 0.0489
60.8 1400 0.0494
63.04 1450 0.0412
65.2 1500 0.0447
67.36 1550 0.0379
69.52 1600 0.0401
71.68 1650 0.0449
73.84 1700 0.0377
76.08 1750 0.0375
78.24 1800 0.0394
80.4 1850 0.0392
82.56 1900 0.0404
84.72 1950 0.0392
86.88 2000 0.0427
89.12 2050 0.0357
91.28 2100 0.0339
93.44 2150 0.0443
95.6 2200 0.0405
97.76 2250 0.0362
99.92 2300 0.0323
102.16 2350 0.0335
104.32 2400 0.0408
106.48 2450 0.034
108.64 2500 0.0383
110.8 2550 0.0299
113.04 2600 0.0306
115.2 2650 0.0351
117.36 2700 0.0322
119.52 2750 0.041
121.68 2800 0.0292
123.84 2850 0.027
126.08 2900 0.0323
128.24 2950 0.0355
130.4 3000 0.0366
132.56 3050 0.0312
134.72 3100 0.0279
136.88 3150 0.0306
139.12 3200 0.0245
141.28 3250 0.0325
143.44 3300 0.0356
145.6 3350 0.0362
147.76 3400 0.0287
149.92 3450 0.0339
1.6389 50 0.0386
3.5278 100 0.0366
5.4167 150 0.0364
7.3056 200 0.0394
9.1944 250 0.0387
11.0833 300 0.0407
12.7222 350 0.0392
14.6111 400 0.0395
16.5 450 0.0393
18.3889 500 0.0361
20.2778 550 0.0347
22.1667 600 0.0346
24.0556 650 0.0371
25.6944 700 0.0411
27.5833 750 0.0329
29.4722 800 0.0337
31.3611 850 0.0325
33.25 900 0.034
35.1389 950 0.0352
37.0278 1000 0.0305
38.6667 1050 0.0311
40.5556 1100 0.0314
42.4444 1150 0.0307
44.3333 1200 0.0324
46.2222 1250 0.0355
48.1111 1300 0.0306
49.75 1350 0.027
51.6389 1400 0.0282
53.5278 1450 0.0318
55.4167 1500 0.0314
57.3056 1550 0.0323
59.1944 1600 0.0286
61.0833 1650 0.0338
62.7222 1700 0.0287
64.6111 1750 0.0309
66.5 1800 0.0287
68.3889 1850 0.028
70.2778 1900 0.026
72.1667 1950 0.0269
74.0556 2000 0.0295
75.6944 2050 0.0257
77.5833 2100 0.0261
79.4722 2150 0.0304
81.3611 2200 0.0265
83.25 2250 0.0274
85.1389 2300 0.0276
87.0278 2350 0.0325
88.6667 2400 0.0233
90.5556 2450 0.0212
92.4444 2500 0.0243
94.3333 2550 0.0288
96.2222 2600 0.026
98.1111 2650 0.029
99.75 2700 0.0228
101.6389 2750 0.0265
103.5278 2800 0.017
105.4167 2850 0.026
107.3056 2900 0.0257
109.1944 2950 0.0237
111.0833 3000 0.0261
112.7222 3050 0.0204
114.6111 3100 0.0186
116.5 3150 0.0206
118.3889 3200 0.0233
120.2778 3250 0.0235
122.1667 3300 0.0232
124.0556 3350 0.0194
125.6944 3400 0.0242
127.5833 3450 0.0234
129.4722 3500 0.023
131.3611 3550 0.0187
133.25 3600 0.0208
135.1389 3650 0.0201
137.0278 3700 0.024
138.6667 3750 0.0255
140.5556 3800 0.0201
142.4444 3850 0.0231
144.3333 3900 0.0199
146.2222 3950 0.018
148.1111 4000 0.0228
149.75 4050 0.0204
151.6389 4100 0.025
153.5278 4150 0.0163
155.4167 4200 0.0157
157.3056 4250 0.0189
159.1944 4300 0.0176
161.0833 4350 0.03
162.7222 4400 0.0197
164.6111 4450 0.0207
166.5 4500 0.0189
168.3889 4550 0.0132
170.2778 4600 0.0178
172.1667 4650 0.0216
174.0556 4700 0.0174
175.6944 4750 0.0229
177.5833 4800 0.0181
179.4722 4850 0.0161
181.3611 4900 0.0236
183.25 4950 0.0185
185.1389 5000 0.02
187.0278 5050 0.0147
188.6667 5100 0.0203
190.5556 5150 0.0159
192.4444 5200 0.0133
194.3333 5250 0.0192
196.2222 5300 0.0162
198.1111 5350 0.0183
199.75 5400 0.015
201.6389 5450 0.0145
203.5278 5500 0.017
205.4167 5550 0.0219
207.3056 5600 0.0195
209.1944 5650 0.0186
211.0833 5700 0.0142
212.7222 5750 0.0191
214.6111 5800 0.0167
216.5 5850 0.013
218.3889 5900 0.0154
220.2778 5950 0.0135
222.1667 6000 0.0139
224.0556 6050 0.0203
225.6944 6100 0.0169
227.5833 6150 0.0146
229.4722 6200 0.0206
231.3611 6250 0.0149
233.25 6300 0.014
235.1389 6350 0.0174
237.0278 6400 0.0191
238.6667 6450 0.0137
240.5556 6500 0.0125
242.4444 6550 0.0081
244.3333 6600 0.0145
246.2222 6650 0.0116
248.1111 6700 0.0154
249.75 6750 0.0179

Framework Versions

  • Python: 3.11.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.51.3
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.6.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

CustomBatchAllTripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}