---
language:
- en
license: apache-2.0
tags:
- biencoder
- sentence-transformers
- text-classification
- sentence-pair-classification
- semantic-similarity
- semantic-search
- retrieval
- reranking
- generated_from_trainer
- dataset_size:1451941
- loss:MultipleNegativesRankingLoss
base_model: answerdotai/ModernBERT-base
widget:
- source_sentence: Gocharya ji authored Krishna Cahrit Manas in the poetic form describing about the full life of Lord Krishna ( from birth to Nirvana ) .
  sentences:
  - 'Q: Can I buy coverage for prescription drugs right away?'
  - Krishna Cahrit Manas in poetic form , describing the full life of Lord Krishna ( from birth to nirvana ) , wrote Gocharya ji .
  - Baron played actress Violet Carson who portrayed Ena Sharples in the soap .
- source_sentence: The Kilkenny line only reached Maryborough in 1867 .
  sentences:
  - It was also known formerly as ' Crotto ' .
  - The line from Maryborough only reached Kilkenny in 1867 .
  - The line from Kilkenny only reached Maryborough in 1867 .
- source_sentence: Tokelau International Netball Team represents Tokelau in the national netball .
  sentences:
  - Ernest Dewey Albinson ( 1898 in Minneapolis , Minnesota - 1971 in Mexico ) was an American artist .
  - The Tokelau national netball team represents Tokelau in international netball .
  - The Tokelau international netball team represents Tokelau in national netball .
- source_sentence: The real number is called the `` imaginary part `` of the real number ; the real number is called the `` complex part `` of .
  sentences:
  - The school board consists of Robbie Sanders , Bryan Richards , Linda Fullingim , Lori Lambert , & Kelly Teague .
  - Which web design company has the best templates?
  - The real number is called the `` imaginary part `` of the real number , the real number of `` complex part `` of .
- source_sentence: All For You was the third and last single of Kate Ryan 's third album `` Alive `` .
  sentences:
  - According to John Keay , he was `` country bred `` ( born and educated in India ) .
  - All For You was the third single of the third and last album `` Alive `` by Kate Ryan .
  - All For You was the third and last single of the third album of Kate Ryan `` Alive `` .
datasets:
- redis/langcache-sentencepairs-v1
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_precision@1
- cosine_recall@1
- cosine_ndcg@10
- cosine_mrr@1
- cosine_map@100
model-index:
- name: Redis fine-tuned BiEncoder model for semantic caching on LangCache
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: train
      type: train
    metrics:
    - type: cosine_accuracy@1
      value: 0.37778739987010174
      name: Cosine Accuracy@1
    - type: cosine_precision@1
      value: 0.37778739987010174
      name: Cosine Precision@1
    - type: cosine_recall@1
      value: 0.36103963757730806
      name: Cosine Recall@1
    - type: cosine_ndcg@10
      value: 0.5622280163193171
      name: Cosine Ndcg@10
    - type: cosine_mrr@1
      value: 0.37778739987010174
      name: Cosine Mrr@1
    - type: cosine_map@100
      value: 0.5081953861443469
      name: Cosine Map@100
---

# Redis fine-tuned BiEncoder model for semantic caching on LangCache

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1) dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for sentence pair similarity.
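To make the semantic-caching use case concrete, here is a minimal sketch of a cache lookup driven by cosine similarity. It is an illustration under stated assumptions, not the LangCache implementation: the helper names `cosine_similarity` and `cache_lookup`, the `0.9` threshold, and the toy 3-dimensional vectors are all hypothetical. In practice the vectors would be the 768-dimensional embeddings produced by this model, and the threshold would be tuned on real traffic.

```python
import math
from typing import List, Optional, Sequence, Tuple


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_u * norm_v)


def cache_lookup(
    query_vec: Sequence[float],
    cache: List[Tuple[Sequence[float], str]],
    threshold: float = 0.9,  # hypothetical value, not taken from the model card
) -> Optional[str]:
    """Return the response stored under the most similar key embedding,
    or None when no key reaches the similarity threshold (a cache miss)."""
    best_score, best_response = -1.0, None
    for key_vec, response in cache:
        score = cosine_similarity(query_vec, key_vec)
        if score > best_score:
            best_score, best_response = score, response
    return best_response if best_score >= threshold else None


# Toy 3-dimensional vectors stand in for the model's 768-dimensional embeddings.
cache = [
    ([1.0, 0.0, 0.0], "cached answer A"),
    ([0.0, 1.0, 0.0], "cached answer B"),
]
hit = cache_lookup([0.9, 0.1, 0.0], cache)   # close to the first key
miss = cache_lookup([0.5, 0.5, 0.7], cache)  # not close to any key
```

A linear scan is enough to show the idea; a production cache would more plausibly delegate the nearest-neighbour search to a vector index.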
## Model Details

### Model Description
- **Model Type:** Sentence Transformer
- **Base model:** [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base)
- **Maximum Sequence Length:** 100 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1)
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 100, 'do_lower_case': False, 'architecture': 'ModernBertModel'})
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("redis/langcache-embed-v3")
# Run inference
sentences = [
    "All For You was the third and last single of Kate Ryan 's third album `` Alive `` .",
    'All For You was the third and last single of the third album of Kate Ryan `` Alive `` .',
    'All For You was the third single of the third and last album `` Alive `` by Kate Ryan .',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[0.9961, 0.9922, 0.9922],
#         [0.9922, 1.0000, 0.9961],
#         [0.9922, 0.9961, 1.0000]], dtype=torch.bfloat16)
```

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `train`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator)

| Metric             | Value      |
|:-------------------|:-----------|
| cosine_accuracy@1  | 0.3778     |
| cosine_precision@1 | 0.3778     |
| cosine_recall@1    | 0.361      |
| **cosine_ndcg@10** | **0.5622** |
| cosine_mrr@1       | 0.3778     |
| cosine_map@100     | 0.5082     |

## Training Details

### Training Dataset

#### LangCache Sentence Pairs (all)

* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1)
* Size: 109,885 training samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details |        |          |          |

* Samples:

| anchor | positive | negative |
|:-------|:---------|:---------|
| The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
| The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
| Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```

### Evaluation Dataset

#### LangCache Sentence Pairs (all)

* Dataset: [LangCache Sentence Pairs (all)](https://huggingface.co/datasets/redis/langcache-sentencepairs-v1)
* Size: 109,885 evaluation samples
* Columns: anchor, positive, and negative
* Approximate statistics based on the first 1000 samples:

  |         | anchor | positive | negative |
  |:--------|:-------|:---------|:---------|
  | type    | string | string   | string   |
  | details |        |          |          |

* Samples:

  | anchor | positive | negative |
  |:-------|:---------|:---------|
  | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The newer punts are still very much in existence today and run in the same fleets as the older boats . | how can I get financial freedom as soon as possible? |
  | The newer punts are still very much in existence today and run in the same fleets as the older boats . | The newer Punts are still very much in existence today and race in the same fleets as the older boats . | The older Punts are still very much in existence today and race in the same fleets as the newer boats . |
  | Turner Valley , was at the Turner Valley Bar N Ranch Airport , southwest of the Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley , , was located at Turner Valley Bar N Ranch Airport , southwest of Turner Valley Bar N Ranch , Alberta , Canada . | Turner Valley Bar N Ranch Airport , , was located at Turner Valley Bar N Ranch , southwest of Turner Valley , Alberta , Canada . |

* Loss: [MultipleNegativesRankingLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#multiplenegativesrankingloss) with these parameters:
  ```json
  {
      "scale": 20.0,
      "similarity_fct": "cos_sim",
      "gather_across_devices": false
  }
  ```

### Training Logs

| Epoch | Step | train_cosine_ndcg@10 |
|:-----:|:----:|:--------------------:|
| -1    | -1   | 0.5622               |

### Framework Versions

- Python: 3.12.3
- Sentence Transformers: 5.1.0
- Transformers: 4.56.0
- PyTorch: 2.8.0+cu128
- Accelerate: 1.10.1
- Datasets: 4.0.0
- Tokenizers: 0.22.0

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
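
For readers who want to see what the loss configuration under Training Details (`scale` 20.0, `cos_sim`) amounts to, the following is a rough pure-Python sketch of a multiple-negatives ranking objective over (anchor, positive, negative) triplets. It is a reading of the published formulation, not the Sentence Transformers implementation: the function names `cos_sim` and `mnr_loss` and the 2-dimensional toy vectors are hypothetical. Each anchor is scored against every positive and every negative in the batch, and the loss is the cross-entropy of picking that anchor's own positive.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def cos_sim(u: Vector, v: Vector) -> float:
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def mnr_loss(
    anchors: List[Vector],
    positives: List[Vector],
    negatives: List[Vector],
    scale: float = 20.0,  # matches the "scale" value listed in the card
) -> float:
    """Mean cross-entropy of ranking each anchor's own positive first.

    Candidates for every anchor are all positives in the batch followed by
    all negatives; the correct candidate for anchor i is positive i.
    """
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cos_sim(anchor, c) for c in candidates]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)


# Toy batch: each positive points almost the same way as its anchor.
anchors = [[1.0, 0.0], [0.0, 1.0]]
positives = [[1.0, 0.1], [0.1, 1.0]]
negatives = [[-1.0, 0.0], [0.0, -1.0]]

aligned_loss = mnr_loss(anchors, positives, negatives)
# Swapping the positives pairs each anchor with the wrong sentence.
swapped_loss = mnr_loss(anchors, positives[::-1], negatives)
```

With well-aligned pairs the loss is close to zero, while the swapped batch is penalised heavily, which is the behaviour that pushes paraphrases together and other in-batch sentences apart.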