SentenceTransformer based on codersan/FaMiniLM

This is a sentence-transformers model finetuned from codersan/FaMiniLM. It maps sentences & paragraphs to a 384-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: codersan/FaMiniLM
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 384 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
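
For reference, the three modules above correspond roughly to the manual pipeline sketched below with 🤗 Transformers (this assumes the Hub repository also exposes the underlying BertModel weights and tokenizer for AutoModel, which is typical for Sentence Transformers checkpoints):

import torch
import torch.nn.functional as F
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("codersan/FaMiniLm_Mizan3")
model = AutoModel.from_pretrained("codersan/FaMiniLm_Mizan3")

sentences = ["This is an example sentence.", "این یک جمله نمونه است."]
encoded = tokenizer(sentences, padding=True, truncation=True, max_length=256, return_tensors="pt")

with torch.no_grad():
    token_embeddings = model(**encoded).last_hidden_state  # (batch, seq_len, 384)

# (1) Pooling: mask-aware mean over token embeddings
mask = encoded["attention_mask"].unsqueeze(-1).float()
sentence_embeddings = (token_embeddings * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# (2) Normalize: L2-normalize, so a dot product between embeddings equals cosine similarity
sentence_embeddings = F.normalize(sentence_embeddings, p=2, dim=1)
print(sentence_embeddings.shape)  # torch.Size([2, 384])

Because of the final Normalize() module, the dot product of two embeddings is their cosine similarity.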

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("codersan/FaMiniLm_Mizan3")
# Run inference
sentences = [
    'اگر این کار مداومت می\u200cیافت، سنگر قادر به مقاومت نمی\u200cبود.',
    'If this were continued, the barricade was no longer tenable.',
    'Well, for this moment she had a protector.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 384]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
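
Since the embeddings are unit-normalized, the model can also be used directly for semantic search and paraphrase mining. A minimal sketch follows; the corpus and query strings are made up for illustration:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("codersan/FaMiniLm_Mizan3")

# Hypothetical corpus and query, only to illustrate the API
corpus = [
    "The barricade could not hold much longer.",
    "She finally had someone to protect her.",
    "The girls stood up to obey their father.",
]
query = "They rose to obey."

corpus_embeddings = model.encode(corpus, convert_to_tensor=True)
query_embedding = model.encode(query, convert_to_tensor=True)

# Rank the corpus by cosine similarity to the query
hits = util.semantic_search(query_embedding, corpus_embeddings, top_k=2)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {corpus[hit['corpus_id']]}")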

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,021,596 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    • anchor: string, min: 4 tokens, mean: 46.68 tokens, max: 212 tokens
    • positive: string, min: 3 tokens, mean: 16.07 tokens, max: 81 tokens
  • Samples (anchor → positive):
    • دختران برای اطاعت امر پدر از جا برخاستند. → They arose to obey.
    • همه چیز را بم وقع خواهی دانست. → You'll know it all in time
    • او هر لحظه گرفتار یک‌ وضع است، زارزار گریه می‌کند. می‌گوید به ما توهین کرده‌اند، حیثیتمان را لکه‌دار نمودند. → She is in hysterics up there, and moans and says that we have been 'shamed and disgraced.
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
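
Conceptually, MultipleNegativesRankingLoss treats every other positive in the batch as a negative for a given anchor: it scales the anchor-positive cosine similarity matrix by 20.0 and applies cross-entropy with the matching pair on the diagonal. A rough sketch of that computation (not the library implementation) is:

import torch
import torch.nn.functional as F

def mnr_loss_sketch(anchor_emb: torch.Tensor, positive_emb: torch.Tensor, scale: float = 20.0) -> torch.Tensor:
    # Cosine similarity between every anchor and every positive in the batch
    a = F.normalize(anchor_emb, dim=1)
    p = F.normalize(positive_emb, dim=1)
    scores = a @ p.T * scale                  # shape: (batch, batch)
    labels = torch.arange(scores.size(0))     # the i-th positive belongs to the i-th anchor
    return F.cross_entropy(scores, labels)

In the library this corresponds to MultipleNegativesRankingLoss(model, scale=20.0, similarity_fct=cos_sim).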
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • learning_rate: 2e-05
  • num_train_epochs: 1
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
  • push_to_hub: True
  • hub_model_id: codersan/FaMiniLm_Mizan3
  • eval_on_start: True
  • batch_sampler: no_duplicates
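
Putting the loss and the hyperparameters above together, the fine-tuning setup likely looked roughly like the sketch below. Only the values listed on this card are taken as given; the dataset rows, output directory, and evaluation/save steps are placeholders, and the push-to-Hub options are left as a comment so the sketch runs without Hub credentials:

from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss
from sentence_transformers.training_args import BatchSamplers

model = SentenceTransformer("codersan/FaMiniLM")

# Placeholder data: the real dataset has ~1M (anchor, positive) pairs as described above
pairs = {
    "anchor": ["دختران برای اطاعت امر پدر از جا برخاستند.", "همه چیز را بم وقع خواهی دانست."],
    "positive": ["They arose to obey.", "You'll know it all in time"],
}
train_dataset = Dataset.from_dict(pairs)
eval_dataset = Dataset.from_dict(pairs)  # placeholder; a held-out split would be used in practice

loss = MultipleNegativesRankingLoss(model)  # scale=20.0, cos_sim by default

args = SentenceTransformerTrainingArguments(
    output_dir="FaMiniLm_Mizan3",        # placeholder path
    num_train_epochs=1,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
    warmup_ratio=0.1,
    eval_strategy="steps",
    eval_steps=100,                      # assumption: not listed on the card
    save_steps=100,                      # assumption: kept equal to eval_steps for best-model loading
    eval_on_start=True,
    load_best_model_at_end=True,
    batch_sampler=BatchSamplers.NO_DUPLICATES,
    # The card also lists push_to_hub=True with hub_model_id="codersan/FaMiniLm_Mizan3";
    # omitted here so the sketch runs without Hub write access.
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()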

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 2e-05
  • weight_decay: 0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 1
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: codersan/FaMiniLm_Mizan3
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: True
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss
0 0 -
0.0016 100 3.1518
0.0031 200 3.1015
0.0047 300 2.9207
0.0063 400 2.8322
0.0078 500 2.7199
0.0094 600 2.6413
0.0110 700 2.4895
0.0125 800 2.4221
0.0141 900 2.2712
0.0157 1000 2.1497
0.0172 1100 2.0346
0.0188 1200 1.9132
0.0204 1300 1.848
0.0219 1400 1.7412
0.0235 1500 1.6231
0.0251 1600 1.5678
0.0266 1700 1.4954
0.0282 1800 1.4429
0.0298 1900 1.4179
0.0313 2000 1.3837
0.0329 2100 1.3612
0.0345 2200 1.3025
0.0360 2300 1.2768
0.0376 2400 1.2126
0.0392 2500 1.1951
0.0407 2600 1.1558
0.0423 2700 1.1002
0.0439 2800 1.1269
0.0454 2900 1.0932
0.0470 3000 1.0697
0.0486 3100 1.0455
0.0501 3200 1.0405
0.0517 3300 0.9895
0.0532 3400 0.9983
0.0548 3500 0.9381
0.0564 3600 0.9618
0.0579 3700 0.9799
0.0595 3800 0.8866
0.0611 3900 0.9085
0.0626 4000 0.9123
0.0642 4100 0.9017
0.0658 4200 0.8789
0.0673 4300 0.8164
0.0689 4400 0.8131
0.0705 4500 0.7834
0.0720 4600 0.7814
0.0736 4700 0.7927
0.0752 4800 0.8416
0.0767 4900 0.73
0.0783 5000 0.753
0.0799 5100 0.7397
0.0814 5200 0.7242
0.0830 5300 0.734
0.0846 5400 0.7379
0.0861 5500 0.7255
0.0877 5600 0.7621
0.0893 5700 0.6825
0.0908 5800 0.7056
0.0924 5900 0.6877
0.0940 6000 0.6865
0.0955 6100 0.6652
0.0971 6200 0.6445
0.0987 6300 0.6548
0.1002 6400 0.6556
0.1018 6500 0.6544
0.1034 6600 0.6496
0.1049 6700 0.6158
0.1065 6800 0.6693
0.1081 6900 0.6179
0.1096 7000 0.5527
0.1112 7100 0.596
0.1128 7200 0.5625
0.1143 7300 0.592
0.1159 7400 0.6063
0.1175 7500 0.5163
0.1190 7600 0.5472
0.1206 7700 0.5849
0.1222 7800 0.5948
0.1237 7900 0.5245
0.1253 8000 0.5561
0.1269 8100 0.5175
0.1284 8200 0.4929
0.1300 8300 0.5158
0.1316 8400 0.5429
0.1331 8500 0.5324
0.1347 8600 0.511
0.1363 8700 0.5242
0.1378 8800 0.5202
0.1394 8900 0.4967
0.1410 9000 0.5466
0.1425 9100 0.4865
0.1441 9200 0.5172
0.1457 9300 0.51
0.1472 9400 0.5204
0.1488 9500 0.4851
0.1504 9600 0.4726
0.1519 9700 0.4608
0.1535 9800 0.453
0.1551 9900 0.4539
0.1566 10000 0.442
0.1582 10100 0.4632
0.1597 10200 0.4024
0.1613 10300 0.4516
0.1629 10400 0.4551
0.1644 10500 0.4598
0.1660 10600 0.4791
0.1676 10700 0.4295
0.1691 10800 0.4552
0.1707 10900 0.4548
0.1723 11000 0.4795
0.1738 11100 0.4694
0.1754 11200 0.4049
0.1770 11300 0.4473
0.1785 11400 0.4161
0.1801 11500 0.4106
0.1817 11600 0.4276
0.1832 11700 0.416
0.1848 11800 0.4184
0.1864 11900 0.4268
0.1879 12000 0.4169
0.1895 12100 0.4063
0.1911 12200 0.4257
0.1926 12300 0.4114
0.1942 12400 0.3921
0.1958 12500 0.4037
0.1973 12600 0.4642
0.1989 12700 0.3929
0.2005 12800 0.4059
0.2020 12900 0.4132
0.2036 13000 0.4101
0.2052 13100 0.4122
0.2067 13200 0.3954
0.2083 13300 0.3671
0.2099 13400 0.4257
0.2114 13500 0.3719
0.2130 13600 0.3603
0.2146 13700 0.3465
0.2161 13800 0.3726
0.2177 13900 0.4021
0.2193 14000 0.3706
0.2208 14100 0.3471
0.2224 14200 0.3848
0.2240 14300 0.3967
0.2255 14400 0.3985
0.2271 14500 0.3457
0.2287 14600 0.3438
0.2302 14700 0.3333
0.2318 14800 0.3525
0.2334 14900 0.3948
0.2349 15000 0.3657
0.2365 15100 0.3437
0.2381 15200 0.361
0.2396 15300 0.356
0.2412 15400 0.3572
0.2428 15500 0.3464
0.2443 15600 0.3885
0.2459 15700 0.3324
0.2475 15800 0.3553
0.2490 15900 0.3201
0.2506 16000 0.4078
0.2522 16100 0.3919
0.2537 16200 0.3505
0.2553 16300 0.3423
0.2569 16400 0.3018
0.2584 16500 0.3392
0.2600 16600 0.3128
0.2616 16700 0.3542
0.2631 16800 0.3639
0.2647 16900 0.3765
0.2662 17000 0.3405
0.2678 17100 0.326
0.2694 17200 0.3591
0.2709 17300 0.3087
0.2725 17400 0.3336
0.2741 17500 0.2889
0.2756 17600 0.3341
0.2772 17700 0.3468
0.2788 17800 0.3033
0.2803 17900 0.3482
0.2819 18000 0.3649
0.2835 18100 0.3134
0.2850 18200 0.3264
0.2866 18300 0.3127
0.2882 18400 0.3483
0.2897 18500 0.349
0.2913 18600 0.2957
0.2929 18700 0.3443
0.2944 18800 0.2884
0.2960 18900 0.34
0.2976 19000 0.2875
0.2991 19100 0.3322
0.3007 19200 0.3438
0.3023 19300 0.3188
0.3038 19400 0.3315
0.3054 19500 0.3018
0.3070 19600 0.331
0.3085 19700 0.34
0.3101 19800 0.2819
0.3117 19900 0.3218
0.3132 20000 0.3026
0.3148 20100 0.3341
0.3164 20200 0.285
0.3179 20300 0.3076
0.3195 20400 0.3262
0.3211 20500 0.3225
0.3226 20600 0.293
0.3242 20700 0.3187
0.3258 20800 0.3255
0.3273 20900 0.2978
0.3289 21000 0.2946
0.3305 21100 0.2887
0.3320 21200 0.3098
0.3336 21300 0.2942
0.3352 21400 0.3134
0.3367 21500 0.267
0.3383 21600 0.2907
0.3399 21700 0.2919
0.3414 21800 0.2985
0.3430 21900 0.2815
0.3446 22000 0.2785
0.3461 22100 0.2932
0.3477 22200 0.2599
0.3493 22300 0.2697
0.3508 22400 0.3206
0.3524 22500 0.2874
0.3540 22600 0.2947
0.3555 22700 0.2863
0.3571 22800 0.2906
0.3587 22900 0.3155
0.3602 23000 0.304
0.3618 23100 0.2769
0.3634 23200 0.3024
0.3649 23300 0.2877
0.3665 23400 0.2907
0.3681 23500 0.2813
0.3696 23600 0.3059
0.3712 23700 0.3004
0.3727 23800 0.261
0.3743 23900 0.2952
0.3759 24000 0.2687
0.3774 24100 0.2645
0.3790 24200 0.323
0.3806 24300 0.2982
0.3821 24400 0.2797
0.3837 24500 0.2661
0.3853 24600 0.251
0.3868 24700 0.2991
0.3884 24800 0.2634
0.3900 24900 0.2716
0.3915 25000 0.2902
0.3931 25100 0.276
0.3947 25200 0.2695
0.3962 25300 0.2415
0.3978 25400 0.2694
0.3994 25500 0.2604
0.4009 25600 0.2966
0.4025 25700 0.2798
0.4041 25800 0.2354
0.4056 25900 0.3068
0.4072 26000 0.2434
0.4088 26100 0.24
0.4103 26200 0.2888
0.4119 26300 0.2525
0.4135 26400 0.2632
0.4150 26500 0.2643
0.4166 26600 0.2585
0.4182 26700 0.236
0.4197 26800 0.2796
0.4213 26900 0.2658
0.4229 27000 0.241
0.4244 27100 0.2764
0.4260 27200 0.2534
0.4276 27300 0.2572
0.4291 27400 0.2513
0.4307 27500 0.2254
0.4323 27600 0.2734
0.4338 27700 0.2459
0.4354 27800 0.2202
0.4370 27900 0.2583
0.4385 28000 0.2741
0.4401 28100 0.2329
0.4417 28200 0.2262
0.4432 28300 0.2573
0.4448 28400 0.2559
0.4464 28500 0.3188
0.4479 28600 0.2431
0.4495 28700 0.275
0.4511 28800 0.25
0.4526 28900 0.2721
0.4542 29000 0.2401
0.4558 29100 0.2435
0.4573 29200 0.2703
0.4589 29300 0.2266
0.4605 29400 0.263
0.4620 29500 0.242
0.4636 29600 0.2844
0.4652 29700 0.2317
0.4667 29800 0.2768
0.4683 29900 0.2496
0.4699 30000 0.2377
0.4714 30100 0.2813
0.4730 30200 0.2175
0.4745 30300 0.2502
0.4761 30400 0.2591
0.4777 30500 0.2547
0.4792 30600 0.2521
0.4808 30700 0.263
0.4824 30800 0.1986
0.4839 30900 0.2437
0.4855 31000 0.2397
0.4871 31100 0.2424
0.4886 31200 0.2785
0.4902 31300 0.2517
0.4918 31400 0.2467
0.4933 31500 0.242
0.4949 31600 0.26
0.4965 31700 0.2345
0.4980 31800 0.2228
0.4996 31900 0.2455
0.5012 32000 0.2505
0.5027 32100 0.2352
0.5043 32200 0.2529
0.5059 32300 0.2537
0.5074 32400 0.2147
0.5090 32500 0.2085
0.5106 32600 0.2472
0.5121 32700 0.2487
0.5137 32800 0.2543
0.5153 32900 0.2519
0.5168 33000 0.2589
0.5184 33100 0.2232
0.5200 33200 0.2148
0.5215 33300 0.2377
0.5231 33400 0.2311
0.5247 33500 0.2153
0.5262 33600 0.2138
0.5278 33700 0.218
0.5294 33800 0.2298
0.5309 33900 0.2663
0.5325 34000 0.2489
0.5341 34100 0.2129
0.5356 34200 0.2298
0.5372 34300 0.2742
0.5388 34400 0.2389
0.5403 34500 0.2232
0.5419 34600 0.1931
0.5435 34700 0.2504
0.5450 34800 0.2349
0.5466 34900 0.22
0.5482 35000 0.249
0.5497 35100 0.2541
0.5513 35200 0.2406
0.5529 35300 0.2168
0.5544 35400 0.2481
0.5560 35500 0.2274
0.5576 35600 0.2168
0.5591 35700 0.2443
0.5607 35800 0.2378
0.5623 35900 0.2364
0.5638 36000 0.2232
0.5654 36100 0.2044
0.5670 36200 0.2153
0.5685 36300 0.2178
0.5701 36400 0.2314
0.5717 36500 0.2448
0.5732 36600 0.2652
0.5748 36700 0.2315
0.5764 36800 0.2071
0.5779 36900 0.2267
0.5795 37000 0.2797
0.5810 37100 0.2053
0.5826 37200 0.2331
0.5842 37300 0.2231
0.5857 37400 0.2135
0.5873 37500 0.2424
0.5889 37600 0.2345
0.5904 37700 0.2111
0.5920 37800 0.2553
0.5936 37900 0.2252
0.5951 38000 0.2033
0.5967 38100 0.2284
0.5983 38200 0.213
0.5998 38300 0.195
0.6014 38400 0.1886
0.6030 38500 0.2192
0.6045 38600 0.2569
0.6061 38700 0.1765
0.6077 38800 0.2127
0.6092 38900 0.2213
0.6108 39000 0.2217
0.6124 39100 0.2163
0.6139 39200 0.2141
0.6155 39300 0.2255
0.6171 39400 0.2326
0.6186 39500 0.2005
0.6202 39600 0.2043
0.6218 39700 0.2122
0.6233 39800 0.2212
0.6249 39900 0.2265
0.6265 40000 0.2259
0.6280 40100 0.2456
0.6296 40200 0.2037
0.6312 40300 0.2082
0.6327 40400 0.2284
0.6343 40500 0.2246
0.6359 40600 0.1884
0.6374 40700 0.1909
0.6390 40800 0.2038
0.6406 40900 0.2249
0.6421 41000 0.2211
0.6437 41100 0.2267
0.6453 41200 0.1926
0.6468 41300 0.1787
0.6484 41400 0.2209
0.6500 41500 0.2091
0.6515 41600 0.2064
0.6531 41700 0.2093
0.6547 41800 0.2413
0.6562 41900 0.2141
0.6578 42000 0.2293
0.6594 42100 0.2084
0.6609 42200 0.2095
0.6625 42300 0.2162
0.6641 42400 0.2188
0.6656 42500 0.1992
0.6672 42600 0.2216
0.6688 42700 0.2338
0.6703 42800 0.1941
0.6719 42900 0.2122
0.6735 43000 0.194
0.6750 43100 0.2413
0.6766 43200 0.232
0.6782 43300 0.2115
0.6797 43400 0.2172
0.6813 43500 0.2122
0.6829 43600 0.2059
0.6844 43700 0.2085
0.6860 43800 0.2045
0.6875 43900 0.1893
0.6891 44000 0.204
0.6907 44100 0.1991
0.6922 44200 0.2342
0.6938 44300 0.1834
0.6954 44400 0.1979
0.6969 44500 0.2302
0.6985 44600 0.2144
0.7001 44700 0.185
0.7016 44800 0.2014
0.7032 44900 0.1772
0.7048 45000 0.1967
0.7063 45100 0.1924
0.7079 45200 0.2114
0.7095 45300 0.2091
0.7110 45400 0.2044
0.7126 45500 0.2246
0.7142 45600 0.2109
0.7157 45700 0.1772
0.7173 45800 0.1988
0.7189 45900 0.2183
0.7204 46000 0.1918
0.7220 46100 0.2332
0.7236 46200 0.2097
0.7251 46300 0.2005
0.7267 46400 0.189
0.7283 46500 0.1993
0.7298 46600 0.2224
0.7314 46700 0.2
0.7330 46800 0.1949
0.7345 46900 0.2061
0.7361 47000 0.211
0.7377 47100 0.2393
0.7392 47200 0.2498
0.7408 47300 0.1811
0.7424 47400 0.1873
0.7439 47500 0.2238
0.7455 47600 0.1918
0.7471 47700 0.1805
0.7486 47800 0.2256
0.7502 47900 0.1901
0.7518 48000 0.2344
0.7533 48100 0.2212
0.7549 48200 0.2089
0.7565 48300 0.2169
0.7580 48400 0.2152
0.7596 48500 0.1831
0.7612 48600 0.1521
0.7627 48700 0.2177
0.7643 48800 0.2035
0.7659 48900 0.1713
0.7674 49000 0.2547
0.7690 49100 0.1802
0.7706 49200 0.1975
0.7721 49300 0.2107
0.7737 49400 0.2078
0.7753 49500 0.1917
0.7768 49600 0.1917
0.7784 49700 0.1948
0.7800 49800 0.1881
0.7815 49900 0.1799
0.7831 50000 0.2184
0.7847 50100 0.2323
0.7862 50200 0.1949
0.7878 50300 0.1908
0.7894 50400 0.182
0.7909 50500 0.1783
0.7925 50600 0.2187
0.7940 50700 0.1711
0.7956 50800 0.2127
0.7972 50900 0.1886
0.7987 51000 0.1825
0.8003 51100 0.206
0.8019 51200 0.2058
0.8034 51300 0.2065
0.8050 51400 0.1857
0.8066 51500 0.1853
0.8081 51600 0.2035
0.8097 51700 0.194
0.8113 51800 0.2157
0.8128 51900 0.1965
0.8144 52000 0.1924
0.8160 52100 0.1995
0.8175 52200 0.2166
0.8191 52300 0.15
0.8207 52400 0.1507
0.8222 52500 0.2096
0.8238 52600 0.205
0.8254 52700 0.207
0.8269 52800 0.1735
0.8285 52900 0.1748
0.8301 53000 0.2401
0.8316 53100 0.1749
0.8332 53200 0.1996
0.8348 53300 0.194
0.8363 53400 0.1856
0.8379 53500 0.1926
0.8395 53600 0.1914
0.8410 53700 0.1988
0.8426 53800 0.1778
0.8442 53900 0.1884
0.8457 54000 0.1965
0.8473 54100 0.2086
0.8489 54200 0.1934
0.8504 54300 0.1789
0.8520 54400 0.1947
0.8536 54500 0.1768
0.8551 54600 0.2194
0.8567 54700 0.1944
0.8583 54800 0.1946
0.8598 54900 0.1998
0.8614 55000 0.1716
0.8630 55100 0.202
0.8645 55200 0.2069
0.8661 55300 0.2221
0.8677 55400 0.1859
0.8692 55500 0.1817
0.8708 55600 0.2091
0.8724 55700 0.1756
0.8739 55800 0.1982
0.8755 55900 0.1947
0.8771 56000 0.1745
0.8786 56100 0.1914
0.8802 56200 0.1867
0.8818 56300 0.1935
0.8833 56400 0.1844
0.8849 56500 0.1704
0.8865 56600 0.2127
0.8880 56700 0.224
0.8896 56800 0.2092
0.8912 56900 0.2042
0.8927 57000 0.1898
0.8943 57100 0.1515
0.8958 57200 0.1952
0.8974 57300 0.17
0.8990 57400 0.1843
0.9005 57500 0.2019
0.9021 57600 0.1724
0.9037 57700 0.1912
0.9052 57800 0.1979
0.9068 57900 0.2014
0.9084 58000 0.2063
0.9099 58100 0.1794
0.9115 58200 0.1972
0.9131 58300 0.1501
0.9146 58400 0.2001
0.9162 58500 0.2082
0.9178 58600 0.2076
0.9193 58700 0.1722
0.9209 58800 0.1954
0.9225 58900 0.1604
0.9240 59000 0.1816
0.9256 59100 0.1809
0.9272 59200 0.1762
0.9287 59300 0.215
0.9303 59400 0.1953
0.9319 59500 0.1865
0.9334 59600 0.208
0.9350 59700 0.2035
0.9366 59800 0.1966
0.9381 59900 0.1777
0.9397 60000 0.2044
0.9413 60100 0.1773
0.9428 60200 0.1843
0.9444 60300 0.1786
0.9460 60400 0.1958
0.9475 60500 0.1959
0.9491 60600 0.2047
0.9507 60700 0.2
0.9522 60800 0.1843
0.9538 60900 0.1946
0.9554 61000 0.1752
0.9569 61100 0.1724
0.9585 61200 0.1701
0.9601 61300 0.1791
0.9616 61400 0.1731
0.9632 61500 0.203
0.9648 61600 0.1985
0.9663 61700 0.1968
0.9679 61800 0.1719
0.9695 61900 0.1608
0.9710 62000 0.1691
0.9726 62100 0.1761
0.9742 62200 0.1805
0.9757 62300 0.1732
0.9773 62400 0.1657
0.9789 62500 0.1757
0.9804 62600 0.157
0.9820 62700 0.1995
0.9836 62800 0.1937
0.9851 62900 0.1839
0.9867 63000 0.194
0.9883 63100 0.1755
0.9898 63200 0.1819
0.9914 63300 0.1918
0.9930 63400 0.1636
0.9945 63500 0.1731
0.9961 63600 0.1671
0.9977 63700 0.1704
0.9992 63800 0.2089
  • The bold row denotes the saved checkpoint.

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.3.1
  • Transformers: 4.47.0
  • PyTorch: 2.5.1+cu121
  • Accelerate: 1.2.1
  • Datasets: 3.2.0
  • Tokenizers: 0.21.0
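
To approximately reproduce this environment, the listed versions can be pinned (the exact PyTorch CUDA build, cu121 here, depends on your platform and package index):

pip install sentence-transformers==3.3.1 transformers==4.47.0 torch==2.5.1 accelerate==1.2.1 datasets==3.2.0 tokenizers==0.21.0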

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}