Redis semantic caching CrossEncoder model fine-tuned on Quora Question Pairs

This is a Cross Encoder model fine-tuned from cross-encoder/ms-marco-MiniLM-L6-v2 on the Quora Question Pairs LangCache Train Set using the sentence-transformers library. It computes a score for a pair of texts, which can be used for sentence pair classification, here deciding whether two questions are semantically equivalent so that paraphrased queries can be served from a Redis semantic cache.

Model Details

Model Description

  • Model Type: Cross Encoder
  • Base model: cross-encoder/ms-marco-MiniLM-L6-v2
  • Maximum Sequence Length: 512 tokens
  • Number of Output Labels: 1 label
  • Model Size: 22.7M parameters (F32 safetensors)
  • Training Dataset:
    • Quora Question Pairs LangCache Train Set
  • Language: en
  • License: apache-2.0
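
These properties can be verified after loading the model. A minimal sketch, assuming only the Hugging Face config and tokenizer objects that sentence-transformers exposes as model.model and model.tokenizer:

from sentence_transformers import CrossEncoder

model = CrossEncoder("aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2")

print(model.model.config.num_labels)                     # 1 output label
print(model.tokenizer.model_max_length)                  # 512 tokens
print(sum(p.numel() for p in model.model.parameters()))  # ~22.7M parameters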

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import CrossEncoder

# Download from the 🤗 Hub
model = CrossEncoder("aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2")
# Get scores for pairs of texts
pairs = [
    ['How can I get a list of my Gmail accounts?', 'How can I find all my old Gmail accounts?'],
    ['How can I stop Quora from modifying and editing other people’s questions on Quora?', 'Can I prevent a Quora user from editing my question on Quora?'],
    ['How much does it cost to design a logo in india?', 'How much does it cost to design a logo?'],
    ['What is screenedrenters.com?', 'What is allmyapps.com?'],
    ['What are the best colleges for an MBA in Australia?', 'What are the top MBA schools in Australia?'],
]
scores = model.predict(pairs)
print(scores.shape)
# (5,)

# Or rank different texts based on similarity to a single text
ranks = model.rank(
    'How can I get a list of my Gmail accounts?',
    [
        'How can I find all my old Gmail accounts?',
        'Can I prevent a Quora user from editing my question on Quora?',
        'How much does it cost to design a logo?',
        'What is allmyapps.com?',
        'What are the top MBA schools in Australia?',
    ]
)
# [{'corpus_id': ..., 'score': ...}, {'corpus_id': ..., 'score': ...}, ...]
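
Because the activation function used in training is Identity, model.predict returns raw logits rather than probabilities. Continuing the snippet above, the scores can be thresholded for a duplicate (cache-hit) decision; a minimal sketch using the F1-optimal threshold reported under Evaluation below:

# 3.3412 is the f1_threshold from the Evaluation section; apply a sigmoid
# to the scores instead if probabilities are preferred
F1_THRESHOLD = 3.3412
is_duplicate = scores > F1_THRESHOLD
print(is_duplicate)  # boolean array of shape (5,)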

Evaluation

Metrics

Cross Encoder Classification

Metric              Value
accuracy            0.6956
accuracy_threshold  4.1688
f1                  0.5947
f1_threshold        3.3412
precision           0.4834
recall              0.7727
average_precision   0.6229
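
These numbers are the kind of output produced by a classification evaluator over labeled pairs. A minimal sketch, assuming the CrossEncoderClassificationEvaluator API from sentence-transformers 4.x and two placeholder pairs in place of the full validation set:

from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.evaluation import CrossEncoderClassificationEvaluator

model = CrossEncoder("aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2")

# Placeholder pairs; the reported metrics come from the Quora Question
# Pairs LangCache Validation Set described under Training Details
pairs = [
    ("How can I get a list of my Gmail accounts?", "How can I find all my old Gmail accounts?"),
    ("How much does it cost to design a logo in india?", "How much does it cost to design a logo?"),
]
labels = [1, 0]

evaluator = CrossEncoderClassificationEvaluator(sentence_pairs=pairs, labels=labels, name="quora-eval")
print(evaluator(model))  # dict with accuracy, f1, precision, recall, average_precision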

Training Details

Training Dataset

Quora Question Pairs LangCache Train Set

  • Dataset: Quora Question Pairs LangCache Train Set
  • Size: 363,861 training samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 15, mean: 60.22, max: 229 characters
    • sentence2: string; min: 14, mean: 60.0, max: 274 characters
    • label: int; 0: ~63.50%, 1: ~36.50%
  • Samples:
    • sentence1: Why do people believe in God and how can they say he/she exists?
      sentence2: Why do we kill each other in the name of God?
      label: 0
    • sentence1: What are the chances of a bee sting when a bee buzzes around you?
      sentence2: How can I tell if my bees are agitated/likely to sting?
      label: 0
    • sentence1: If a man from Syro Malankara church marries a Syro-Malabar girl, can they join a Syro-Malabar parish?
      sentence2: Is Malabar Hills of Mumbai anyhow related to Malabar of Kerala?
      label: 0
  • Loss: BinaryCrossEntropyLoss with these parameters (a construction sketch follows below):
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
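
The loss configuration above maps onto sentence-transformers' BinaryCrossEntropyLoss. A minimal construction sketch, assuming the keyword names of the 4.x API:

import torch
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")

# activation_fn=Identity means the loss receives raw logits (binary
# cross-entropy with logits is applied internally); pos_weight=None
# leaves the positive class unweighted despite the ~36/64 label imbalance
loss = BinaryCrossEntropyLoss(
    model=model,
    activation_fn=torch.nn.Identity(),
    pos_weight=None,
)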
    

Evaluation Dataset

Quora Question Pairs LangCache Validation Set

  • Dataset: Quora Question Pairs LangCache Validation Set
  • Size: 40,429 evaluation samples
  • Columns: sentence1, sentence2, and label
  • Approximate statistics based on the first 1000 samples:
    • sentence1: string; min: 13, mean: 59.91, max: 266 characters
    • sentence2: string; min: 13, mean: 59.51, max: 293 characters
    • label: int; 0: ~63.80%, 1: ~36.20%
  • Samples:
    • sentence1: How can I get a list of my Gmail accounts?
      sentence2: How can I find all my old Gmail accounts?
      label: 1
    • sentence1: How can I stop Quora from modifying and editing other people’s questions on Quora?
      sentence2: Can I prevent a Quora user from editing my question on Quora?
      label: 1
    • sentence1: How much does it cost to design a logo in india?
      sentence2: How much does it cost to design a logo?
      label: 0
  • Loss: BinaryCrossEntropyLoss with these parameters:
    {
        "activation_fn": "torch.nn.modules.linear.Identity",
        "pos_weight": null
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • learning_rate: 0.0002
  • num_train_epochs: 15
  • load_best_model_at_end: True
  • push_to_hub: True
  • hub_model_id: aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2
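
Put together, a run with these values could look like the following. A minimal sketch, assuming the sentence-transformers 4.x CrossEncoderTrainer API, with tiny in-memory stand-ins for the LangCache train and validation splits:

from datasets import Dataset
from sentence_transformers import CrossEncoder
from sentence_transformers.cross_encoder import CrossEncoderTrainer, CrossEncoderTrainingArguments
from sentence_transformers.cross_encoder.losses import BinaryCrossEntropyLoss

model = CrossEncoder("cross-encoder/ms-marco-MiniLM-L6-v2")
loss = BinaryCrossEntropyLoss(model)  # defaults match the card: Identity activation, no pos_weight

# Stand-in datasets with the card's column layout (sentence1, sentence2, label)
train_dataset = Dataset.from_dict({
    "sentence1": ["How much does it cost to design a logo in india?"],
    "sentence2": ["How much does it cost to design a logo?"],
    "label": [0],
})
eval_dataset = train_dataset

args = CrossEncoderTrainingArguments(
    output_dir="langcache-crossencoder-v1",
    eval_strategy="steps",
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    learning_rate=2e-4,
    num_train_epochs=15,
    load_best_model_at_end=True,
    push_to_hub=True,
    hub_model_id="aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2",
)

trainer = CrossEncoderTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    loss=loss,
)
trainer.train()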

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 64
  • per_device_eval_batch_size: 64
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 0.0002
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 15
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: True
  • resume_from_checkpoint: None
  • hub_model_id: aditeyabaral-redis/langcache-crossencoder-v1-ms-marco-MiniLM-L6-v2
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss quora-eval_average_precision
0.0879 500 0.3913 0.3302 0.5603
0.1759 1000 0.3408 0.3220 0.5932
0.2638 1500 0.3318 0.3249 0.6144
0.3517 2000 0.3235 0.3027 0.6280
0.4397 2500 0.3173 0.2944 0.6233
0.5276 3000 0.3049 0.3009 0.6685
0.6155 3500 0.3071 0.2908 0.6221
0.7035 4000 0.3015 0.2854 0.6143
0.7914 4500 0.2944 0.2759 0.6361
0.8794 5000 0.2984 0.2854 0.6616
0.9673 5500 0.2898 0.3002 0.6109
1.0552 6000 0.2552 0.2800 0.6466
1.1432 6500 0.2352 0.2821 0.6305
1.2311 7000 0.2366 0.2778 0.5699
1.3190 7500 0.2332 0.2831 0.6076
1.4070 8000 0.2366 0.2783 0.6003
1.4949 8500 0.2391 0.2716 0.6195
1.5828 9000 0.241 0.2685 0.6229
1.6708 9500 0.2359 0.2804 0.6410
1.7587 10000 0.2374 0.2819 0.6448
1.8466 10500 0.2387 0.2750 0.6479
1.9346 11000 0.2343 0.2734 0.6034
2.0225 11500 0.2193 0.3168 0.6384
2.1104 12000 0.1741 0.3011 0.6189
2.1984 12500 0.1732 0.2988 0.6412
2.2863 13000 0.1814 0.2839 0.6156
2.3743 13500 0.1815 0.2930 0.5520
2.4622 14000 0.1774 0.3461 0.6195
2.5501 14500 0.1886 0.3033 0.6113
2.6381 15000 0.1831 0.2925 0.5815
2.7260 15500 0.1889 0.2801 0.5701
2.8139 16000 0.1869 0.2893 0.6090
2.9019 16500 0.1896 0.3038 0.6142
2.9898 17000 0.1967 0.2791 0.5967
3.0777 17500 0.1395 0.3119 0.5672
3.1657 18000 0.1392 0.3052 0.5876
3.2536 18500 0.1411 0.3030 0.6064
3.3415 19000 0.1356 0.3064 0.5535
3.4295 19500 0.14 0.3144 0.5978
3.5174 20000 0.1461 0.3332 0.5961
3.6053 20500 0.1468 0.3179 0.5975
3.6933 21000 0.1487 0.3327 0.5932
3.7812 21500 0.1479 0.3340 0.5888
3.8692 22000 0.1458 0.3172 0.5478
3.9571 22500 0.1566 0.3036 0.5926
4.0450 23000 0.1257 0.3552 0.5941
4.1330 23500 0.1004 0.3886 0.5067
4.2209 24000 0.1061 0.3682 0.5654
4.3088 24500 0.1087 0.3212 0.5556
4.3968 25000 0.11 0.3348 0.5628
4.4847 25500 0.1108 0.3740 0.5046
4.5726 26000 0.1169 0.3092 0.5882
4.6606 26500 0.1156 0.3498 0.4988
4.7485 27000 0.1232 0.3042 0.5801
4.8364 27500 0.1195 0.3685 0.5793
4.9244 28000 0.122 0.3199 0.5383
5.0123 28500 0.1151 0.4291 0.5510
5.1002 29000 0.0815 0.4297 0.4973
5.1882 29500 0.086 0.4798 0.4969
5.2761 30000 0.0892 0.4475 0.5230
5.3641 30500 0.0888 0.4165 0.4267
5.4520 31000 0.0929 0.4398 0.4674
5.5399 31500 0.0929 0.4551 0.4629
5.6279 32000 0.0928 0.3756 0.4537
5.7158 32500 0.0961 0.4014 0.5037
5.8037 33000 0.0924 0.3953 0.5158
5.8917 33500 0.0988 0.3890 0.5355
5.9796 34000 0.0963 0.3823 0.5130
6.0675 34500 0.0738 0.4251 0.4924
6.1555 35000 0.0681 0.4444 0.4891
6.2434 35500 0.0703 0.4472 0.4994
6.3313 36000 0.071 0.4552 0.4920
6.4193 36500 0.0706 0.4149 0.4726
6.5072 37000 0.0751 0.3840 0.4771
6.5951 37500 0.0708 0.4455 0.5152
6.6831 38000 0.0775 0.4124 0.4290
6.7710 38500 0.0766 0.4004 0.4459
6.8590 39000 0.0811 0.4209 0.4192
6.9469 39500 0.0766 0.4294 0.4805
7.0348 40000 0.07 0.4470 0.4623
7.1228 40500 0.05 0.5520 0.4211
7.2107 41000 0.0555 0.4425 0.3890
7.2986 41500 0.057 0.5324 0.4204
7.3866 42000 0.06 0.4664 0.4517
7.4745 42500 0.0583 0.4506 0.4966
7.5624 43000 0.0582 0.4441 0.4659
7.6504 43500 0.0615 0.4528 0.4495
7.7383 44000 0.0614 0.4744 0.4350
7.8262 44500 0.0605 0.4272 0.4630
7.9142 45000 0.0625 0.4709 0.4414
8.0021 45500 0.065 0.4513 0.4060
8.0900 46000 0.0412 0.6073 0.3839
8.1780 46500 0.0431 0.5060 0.3656
8.2659 47000 0.0425 0.5438 0.4042
8.3539 47500 0.0462 0.5835 0.4171
8.4418 48000 0.0475 0.5035 0.4144
8.5297 48500 0.0476 0.5046 0.4105
8.6177 49000 0.0483 0.5080 0.4071
8.7056 49500 0.0487 0.5682 0.4130
8.7935 50000 0.049 0.5026 0.4283
8.8815 50500 0.0517 0.4920 0.3529
8.9694 51000 0.0495 0.4956 0.4038
9.0573 51500 0.0378 0.5368 0.3654
9.1453 52000 0.0328 0.4895 0.3775
9.2332 52500 0.0337 0.5245 0.4051
9.3211 53000 0.0361 0.5925 0.3984
9.4091 53500 0.0369 0.5197 0.4134
9.4970 54000 0.0388 0.5246 0.4186
9.5849 54500 0.0364 0.5243 0.4245
9.6729 55000 0.0373 0.5164 0.4119
9.7608 55500 0.0358 0.6019 0.4171
9.8488 56000 0.0364 0.6166 0.4050
9.9367 56500 0.0406 0.5238 0.4329
10.0246 57000 0.0361 0.6156 0.4138
10.1126 57500 0.0267 0.5612 0.4073
10.2005 58000 0.023 0.6370 0.4049
10.2884 58500 0.0293 0.5876 0.4069
10.3764 59000 0.0255 0.6200 0.4239
10.4643 59500 0.0282 0.5882 0.4085
10.5522 60000 0.0307 0.5499 0.4084
10.6402 60500 0.0294 0.6012 0.3956
10.7281 61000 0.0283 0.6330 0.4027
10.8160 61500 0.0323 0.5620 0.4037
10.9040 62000 0.0305 0.6073 0.4067
10.9919 62500 0.0284 0.5969 0.4048
11.0798 63000 0.0194 0.6831 0.4041
11.1678 63500 0.0209 0.6346 0.3937
11.2557 64000 0.0183 0.6610 0.3691
11.3437 64500 0.0221 0.6509 0.3755
11.4316 65000 0.0217 0.7004 0.4256
11.5195 65500 0.0239 0.5978 0.4087
11.6075 66000 0.0234 0.6237 0.3687
11.6954 66500 0.0222 0.5774 0.4177
11.7833 67000 0.0234 0.6203 0.4368
11.8713 67500 0.0216 0.5981 0.4396
11.9592 68000 0.0235 0.5636 0.4338
12.0471 68500 0.0193 0.6815 0.4295
12.1351 69000 0.0154 0.6883 0.4516
12.2230 69500 0.0153 0.7075 0.4128
12.3109 70000 0.0155 0.6650 0.4300
12.3989 70500 0.0147 0.7161 0.4029
12.4868 71000 0.015 0.7274 0.4082
12.5747 71500 0.0172 0.6526 0.3834
12.6627 72000 0.0156 0.6420 0.3574
12.7506 72500 0.0158 0.6716 0.3905
12.8386 73000 0.0165 0.6757 0.3805
12.9265 73500 0.0144 0.6964 0.3932
13.0144 74000 0.0133 0.7359 0.3913
13.1024 74500 0.0137 0.7126 0.4071
13.1903 75000 0.0118 0.7234 0.4115
13.2782 75500 0.0117 0.7391 0.4225
13.3662 76000 0.0123 0.7435 0.3931
13.4541 76500 0.0121 0.7334 0.4033
13.5420 77000 0.0114 0.7370 0.3965
13.6300 77500 0.0107 0.7646 0.4340
13.7179 78000 0.0123 0.7255 0.4015
13.8058 78500 0.0129 0.6944 0.3901
13.8938 79000 0.0097 0.7561 0.4181
13.9817 79500 0.0121 0.7178 0.3991
14.0696 80000 0.0087 0.7505 0.3858
14.1576 80500 0.0071 0.7765 0.3827
14.2455 81000 0.0082 0.7851 0.3812
14.3335 81500 0.0094 0.7683 0.3877
14.4214 82000 0.0076 0.7705 0.3938
14.5093 82500 0.0071 0.7653 0.3916
14.5973 83000 0.0092 0.7557 0.3851
14.6852 83500 0.0058 0.7718 0.3889
14.7731 84000 0.0069 0.7753 0.3895
14.8611 84500 0.0083 0.7706 0.3902
14.9490 85000 0.0075 0.7741 0.3909
-1 -1 - - 0.6229
  • The saved checkpoint corresponds to the epoch 1.5828 / step 9000 row, which has the lowest validation loss (0.2685); its average precision (0.6229) matches the final evaluation row.

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.0
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1
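
To recreate this environment, the versions above can be pinned at install time (a sketch; the CUDA build of PyTorch, 2.6.0+cu124, may require the matching extra index URL):

pip install sentence-transformers==4.1.0 transformers==4.52.4 torch==2.6.0 \
    accelerate==1.8.0 datasets==3.6.0 tokenizers==0.21.1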

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}