SentenceTransformer based on ashercn97/medicalai_ClinicalBERT-2025-04-11_22-11-59

This is a sentence-transformers model finetuned from ashercn97/medicalai_ClinicalBERT-2025-04-11_22-11-59. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: ashercn97/medicalai_ClinicalBERT-2025-04-11_22-11-59
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: ~135M parameters (F32 safetensors)

Model Sources

  • Documentation: https://www.sbert.net
  • Repository: https://github.com/UKPLab/sentence-transformers
  • Hugging Face: https://huggingface.co/ashercn97/medical-v003

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: DistilBertModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': False})
)
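
The Pooling module above mean-pools the token embeddings (masking out padding) to produce one 768-dimensional vector per input. The following is a minimal sketch of the equivalent computation using the Transformers library directly; loading this checkpoint's backbone with AutoModel/AutoTokenizer is an assumption about the repository layout:

import torch
from transformers import AutoModel, AutoTokenizer

# Assumption: the DistilBERT backbone and tokenizer load directly from the hub repo
tokenizer = AutoTokenizer.from_pretrained("ashercn97/medical-v003")
backbone = AutoModel.from_pretrained("ashercn97/medical-v003")

batch = tokenizer(["description: Bronchiectasis"], padding=True,
                  truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    token_embeddings = backbone(**batch).last_hidden_state    # [batch, seq_len, 768]

# Mean pooling: average token embeddings, excluding padding via the attention mask
mask = batch["attention_mask"].unsqueeze(-1).float()          # [batch, seq_len, 1]
sentence_embedding = (token_embeddings * mask).sum(1) / mask.sum(1).clamp(min=1e-9)
print(sentence_embedding.shape)                               # torch.Size([1, 768])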

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("ashercn97/medical-v003")
# Run inference
sentences = [
    'description: Bronchiectasis',
    'description: Bronchiectasis, uncomplicated',
    'description: Acute on chronic systolic (congestive) heart failure',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
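
For the semantic-search use case mentioned above, the same embeddings can rank a corpus of code descriptions against a query. Here is a small sketch with a hypothetical three-entry corpus, using util.semantic_search from the library:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("ashercn97/medical-v003")

# Hypothetical corpus of description strings to search over
corpus = [
    "description: Bronchiectasis, uncomplicated",
    "description: Acute on chronic systolic (congestive) heart failure",
    "description: Psoriasis, unspecified",
]
corpus_embeddings = model.encode(corpus, convert_to_tensor=True)

query_embedding = model.encode("description: Bronchiectasis", convert_to_tensor=True)
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(corpus[hit["corpus_id"]], round(hit["score"], 4))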

Training Details

Training Dataset

Unnamed Dataset

  • Size: 389,269 training samples
  • Columns: primary_code and description
  • Approximate statistics based on the first 1000 samples:
    column        type    min       mean          max
    primary_code  string  5 tokens  7.63 tokens   27 tokens
    description   string  6 tokens  16.73 tokens  69 tokens
  • Samples:
    primary_code                 | description
    code: 137120                 | description: RADIAL HEAD MOD 10X22MM
    description: LVEF 50-55%     | description: Unspecified systolic (congestive) heart failure
    code: 510347                 | description: MAG-AL UD (MAALOX)
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
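
With these parameters, each (primary_code, description) pair in a batch is treated as a positive and every other description in the same batch serves as an in-batch negative, which is why the no_duplicates batch sampler listed under the hyperparameters matters. A minimal construction sketch, assuming the base checkpoint and a toy two-row stand-in for the pair dataset:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, losses
from sentence_transformers.util import cos_sim

model = SentenceTransformer("ashercn97/medicalai_ClinicalBERT-2025-04-11_22-11-59")
loss = losses.MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim)

# Toy stand-in for the 389,269-pair (primary_code, description) training set
train_dataset = Dataset.from_dict({
    "primary_code": ["description: Psoriasis", "code: 510347"],
    "description": ["description: Psoriasis, unspecified",
                    "description: MAG-AL UD (MAALOX)"],
})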
    

Evaluation Dataset

Unnamed Dataset

  • Size: 10,000 evaluation samples
  • Columns: primary_code and description
  • Approximate statistics based on the first 1000 samples:
    column        type    min       mean          max
    primary_code  string  5 tokens  7.67 tokens   37 tokens
    description   string  5 tokens  16.22 tokens  64 tokens
  • Samples:
    primary_code                                  | description
    description: Psoriasis                        | description: Psoriasis, unspecified
    description: Hodgkin Lymphoma                 | description: Hodgkin lymphoma, unspecified, unspecified site
    description: Cancer-related pain control plan | description: Neoplasm related pain (acute) (chronic)
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • num_train_epochs: 2
  • lr_scheduler_type: cosine
  • warmup_ratio: 0.1
  • seed: 12
  • bf16: True
  • dataloader_num_workers: 64
  • dataloader_prefetch_factor: 5
  • batch_sampler: no_duplicates
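
As a sketch, the non-default values above map onto SentenceTransformerTrainingArguments like this (output_dir is hypothetical; model, train_dataset, and loss are as in the sketch under the training dataset, and eval_dataset stands for the 10,000-sample evaluation split described above):

from sentence_transformers import (SentenceTransformerTrainer,
                                   SentenceTransformerTrainingArguments)
from sentence_transformers.training_args import BatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="medical-v003",                  # hypothetical output path
    eval_strategy="steps",
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    learning_rate=2e-5,
    weight_decay=0.01,
    num_train_epochs=2,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=12,
    bf16=True,
    dataloader_num_workers=64,
    dataloader_prefetch_factor=5,
    batch_sampler=BatchSamplers.NO_DUPLICATES,  # avoid duplicate in-batch negatives
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()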

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 512
  • per_device_eval_batch_size: 512
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0.01
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 2
  • max_steps: -1
  • lr_scheduler_type: cosine
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 12
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: True
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 64
  • dataloader_prefetch_factor: 5
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • tp_size: 0
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0013 1 5.8248 -
0.0066 5 5.7392 -
0.0131 10 5.7616 -
0.0197 15 5.771 -
0.0263 20 5.738 -
0.0329 25 5.6972 -
0.0394 30 5.6486 -
0.0460 35 5.4818 -
0.0526 40 5.3395 -
0.0591 45 5.3319 -
0.0657 50 5.0993 1.5206
0.0723 55 5.0328 -
0.0788 60 4.9303 -
0.0854 65 4.8829 -
0.0920 70 4.8534 -
0.0986 75 4.7204 -
0.1051 80 4.6473 -
0.1117 85 4.5718 -
0.1183 90 4.5464 -
0.1248 95 4.5003 -
0.1314 100 4.4006 1.2175
0.1380 105 4.3973 -
0.1445 110 4.3876 -
0.1511 115 4.2815 -
0.1577 120 4.2261 -
0.1643 125 4.2256 -
0.1708 130 4.0866 -
0.1774 135 4.1415 -
0.1840 140 4.0636 -
0.1905 145 3.993 -
0.1971 150 3.9825 1.0376
0.2037 155 3.9345 -
0.2102 160 3.8686 -
0.2168 165 3.8343 -
0.2234 170 3.8011 -
0.2300 175 3.8103 -
0.2365 180 3.7799 -
0.2431 185 3.7414 -
0.2497 190 3.7447 -
0.2562 195 3.7346 -
0.2628 200 3.622 0.9137
0.2694 205 3.6555 -
0.2760 210 3.5778 -
0.2825 215 3.6234 -
0.2891 220 3.4653 -
0.2957 225 3.5705 -
0.3022 230 3.6318 -
0.3088 235 3.5244 -
0.3154 240 3.4487 -
0.3219 245 3.4906 -
0.3285 250 3.5459 0.8556
0.3351 255 3.3821 -
0.3417 260 3.4249 -
0.3482 265 3.4054 -
0.3548 270 3.4558 -
0.3614 275 3.3719 -
0.3679 280 3.2999 -
0.3745 285 3.3562 -
0.3811 290 3.3306 -
0.3876 295 3.2987 -
0.3942 300 3.2789 0.8102
0.4008 305 3.3221 -
0.4074 310 3.259 -
0.4139 315 3.2014 -
0.4205 320 3.1932 -
0.4271 325 3.2654 -
0.4336 330 3.1644 -
0.4402 335 3.2603 -
0.4468 340 3.2053 -
0.4534 345 3.1934 -
0.4599 350 3.138 0.7800
0.4665 355 3.108 -
0.4731 360 3.1663 -
0.4796 365 3.0978 -
0.4862 370 3.0882 -
0.4928 375 3.0992 -
0.4993 380 3.1188 -
0.5059 385 3.0937 -
0.5125 390 3.0411 -
0.5191 395 3.0851 -
0.5256 400 2.9981 0.7582
0.5322 405 3.0407 -
0.5388 410 2.9823 -
0.5453 415 3.0702 -
0.5519 420 3.0528 -
0.5585 425 3.0542 -
0.5650 430 3.0114 -
0.5716 435 2.9981 -
0.5782 440 2.9551 -
0.5848 445 2.9857 -
0.5913 450 2.9816 0.7337
0.5979 455 2.9808 -
0.6045 460 3.001 -
0.6110 465 2.9569 -
0.6176 470 2.9685 -
0.6242 475 2.8984 -
0.6307 480 2.8961 -
0.6373 485 2.9701 -
0.6439 490 2.8576 -
0.6505 495 2.9435 -
0.6570 500 2.9025 0.7270
0.6636 505 2.9408 -
0.6702 510 2.9115 -
0.6767 515 2.8296 -
0.6833 520 2.8089 -
0.6899 525 2.8953 -
0.6965 530 2.878 -
0.7030 535 2.8488 -
0.7096 540 2.8499 -
0.7162 545 2.7698 -
0.7227 550 2.8673 0.7193
0.7293 555 2.8058 -
0.7359 560 2.8479 -
0.7424 565 2.7514 -
0.7490 570 2.8213 -
0.7556 575 2.8438 -
0.7622 580 2.7368 -
0.7687 585 2.7612 -
0.7753 590 2.8911 -
0.7819 595 2.7759 -
0.7884 600 2.7618 0.6923
0.7950 605 2.7429 -
0.8016 610 2.7693 -
0.8081 615 2.7278 -
0.8147 620 2.8094 -
0.8213 625 2.7303 -
0.8279 630 2.7333 -
0.8344 635 2.6704 -
0.8410 640 2.75 -
0.8476 645 2.7469 -
0.8541 650 2.7348 0.6816
0.8607 655 2.7615 -
0.8673 660 2.7722 -
0.8739 665 2.765 -
0.8804 670 2.7235 -
0.8870 675 2.668 -
0.8936 680 2.7102 -
0.9001 685 2.7256 -
0.9067 690 2.7451 -
0.9133 695 2.1618 -
0.9198 700 1.3555 0.6804
0.9264 705 1.493 -
0.9330 710 1.3587 -
0.9396 715 1.3546 -
0.9461 720 1.3266 -
0.9527 725 1.3071 -
0.9593 730 1.2159 -
0.9658 735 1.376 -
0.9724 740 1.2715 -
0.9790 745 1.4462 -
0.9855 750 1.3423 0.6624
0.9921 755 1.3689 -
0.9987 760 1.3903 -
1.0053 765 2.43 -
1.0118 770 2.6936 -
1.0184 775 2.6122 -
1.0250 780 2.6665 -
1.0315 785 2.5816 -
1.0381 790 2.6004 -
1.0447 795 2.5618 -
1.0512 800 2.5187 0.6604
1.0578 805 2.559 -
1.0644 810 2.6416 -
1.0710 815 2.5599 -
1.0775 820 2.5993 -
1.0841 825 2.6176 -
1.0907 830 2.6315 -
1.0972 835 2.5305 -
1.1038 840 2.5624 -
1.1104 845 2.5767 -
1.1170 850 2.5543 0.6536
1.1235 855 2.5607 -
1.1301 860 2.5992 -
1.1367 865 2.6229 -
1.1432 870 2.597 -
1.1498 875 2.6013 -
1.1564 880 2.5763 -
1.1629 885 2.6565 -
1.1695 890 2.5783 -
1.1761 895 2.5474 -
1.1827 900 2.5754 0.6460
1.1892 905 2.5905 -
1.1958 910 2.6075 -
1.2024 915 2.5284 -
1.2089 920 2.6113 -
1.2155 925 2.5301 -
1.2221 930 2.5992 -
1.2286 935 2.5951 -
1.2352 940 2.5554 -
1.2418 945 2.5287 -
1.2484 950 2.4902 0.6411
1.2549 955 2.5829 -
1.2615 960 2.4933 -
1.2681 965 2.5032 -
1.2746 970 2.579 -
1.2812 975 2.5702 -
1.2878 980 2.5115 -
1.2943 985 2.5074 -
1.3009 990 2.5588 -
1.3075 995 2.4964 -
1.3141 1000 2.4969 0.6405
1.3206 1005 2.5437 -
1.3272 1010 2.5002 -
1.3338 1015 2.5195 -
1.3403 1020 2.5596 -
1.3469 1025 2.4809 -
1.3535 1030 2.5545 -
1.3601 1035 2.5403 -
1.3666 1040 2.538 -
1.3732 1045 2.5768 -
1.3798 1050 2.5246 0.6392
1.3863 1055 2.5714 -
1.3929 1060 2.4998 -
1.3995 1065 2.4409 -
1.4060 1070 2.4343 -
1.4126 1075 2.4988 -
1.4192 1080 2.519 -
1.4258 1085 2.5475 -
1.4323 1090 2.5481 -
1.4389 1095 2.5262 -
1.4455 1100 2.5288 0.6356
1.4520 1105 2.4489 -
1.4586 1110 2.5134 -
1.4652 1115 2.5466 -
1.4717 1120 2.5953 -
1.4783 1125 2.5048 -
1.4849 1130 2.5482 -
1.4915 1135 2.5035 -
1.4980 1140 2.4865 -
1.5046 1145 2.436 -
1.5112 1150 2.5097 0.6339
1.5177 1155 2.4402 -
1.5243 1160 2.5121 -
1.5309 1165 2.5289 -
1.5375 1170 2.4334 -
1.5440 1175 2.5176 -
1.5506 1180 2.4507 -
1.5572 1185 2.5162 -
1.5637 1190 2.4426 -
1.5703 1195 2.4526 -
1.5769 1200 2.4578 0.6315
1.5834 1205 2.4775 -
1.5900 1210 2.4659 -
1.5966 1215 2.4884 -
1.6032 1220 2.4713 -
1.6097 1225 2.4861 -
1.6163 1230 2.4817 -
1.6229 1235 2.4861 -
1.6294 1240 2.4207 -
1.6360 1245 2.5191 -
1.6426 1250 2.5891 0.6282
1.6491 1255 2.4916 -
1.6557 1260 2.4456 -
1.6623 1265 2.4901 -
1.6689 1270 2.5061 -
1.6754 1275 2.5172 -
1.6820 1280 2.4396 -
1.6886 1285 2.5093 -
1.6951 1290 2.4524 -
1.7017 1295 2.4564 -
1.7083 1300 2.48 0.6263
1.7148 1305 2.4826 -
1.7214 1310 2.4376 -
1.7280 1315 2.4966 -
1.7346 1320 2.4468 -
1.7411 1325 2.5125 -
1.7477 1330 2.401 -
1.7543 1335 2.5318 -
1.7608 1340 2.4687 -
1.7674 1345 2.5803 -
1.7740 1350 2.4707 0.6253
1.7806 1355 2.4686 -
1.7871 1360 2.4372 -
1.7937 1365 2.4549 -
1.8003 1370 2.4697 -
1.8068 1375 2.4849 -
1.8134 1380 2.3773 -
1.8200 1385 2.4402 -
1.8265 1390 2.4962 -
1.8331 1395 2.4085 -
1.8397 1400 2.5318 0.6247
1.8463 1405 2.5119 -
1.8528 1410 2.5209 -
1.8594 1415 2.4548 -
1.8660 1420 2.4803 -
1.8725 1425 2.4829 -
1.8791 1430 2.4629 -
1.8857 1435 2.5106 -
1.8922 1440 2.4612 -
1.8988 1445 2.5666 -
1.9054 1450 2.4677 0.6243
1.9120 1455 2.2826 -
1.9185 1460 1.2653 -
1.9251 1465 1.1973 -
1.9317 1470 1.2686 -
1.9382 1475 1.3213 -
1.9448 1480 1.1828 -
1.9514 1485 1.3756 -
1.9580 1490 1.276 -
1.9645 1495 1.1679 -
1.9711 1500 1.1197 0.6244
1.9777 1505 1.3336 -
1.9842 1510 1.2969 -
1.9908 1515 1.1702 -
1.9974 1520 1.0661 -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.0.2
  • Transformers: 4.51.2
  • PyTorch: 2.6.0
  • Accelerate: 1.6.0
  • Datasets: 3.5.0
  • Tokenizers: 0.21.1
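
To approximate this environment, the versions above can be pinned at install time (newer versions will usually work as well):

pip install "sentence-transformers==4.0.2" "transformers==4.51.2" "torch==2.6.0" "accelerate==1.6.0" "datasets==3.5.0" "tokenizers==0.21.1"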

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}