---
language:
- en
license: apache-2.0
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:73
- loss:MatryoshkaLoss
- loss:MultipleNegativesRankingLoss
base_model: thenlper/gte-base
widget:
- source_sentence: What is the maximum value of equipment that can be purchased with a CUE Student Research Project Grant?
  sentences:
  - Equipment costs (valued up to $1000).
  - Variable awards to recognize and reward academic achievement at the senior high school level and to encourage students to pursue post -secondary studies.
  - The Amazon Future Engineer Scholarship provides students with an opportunity to upgrade their careers with a $7,500 CAD/year scholarship available for up to four years.
- source_sentence: What is the minimum distance a recipient's hometown must be from Concordia University of Edmonton to be eligible for the Alberta Blue Cross Away from Home Scholarship?
  sentences:
  - Three awards are available
  - The recipient’s hometown must be at least 100 kilometres from Concordia University of Edmonton.
  - 'Application Deadline: September 1'
- source_sentence: According to the selection criteria, what level of subjects are used to determine the academic standing of a potential Alberta Blue Cross Away from Home Scholarship recipient?
  sentences:
  - Selection is ba sed on the academic standing of 30 -level subjects used for admission.
  - 'These eligible and ineligible lists are not exhaustive. Doubts about the eligibility of expenses should be directed to the ORI’s Research Administration Service s (RAS): ras@concordia.ab.ca .'
  - '*Value: $11000 Master’s; $14,000 Doctoral'
- source_sentence: According to the text, how many days does a grant recipient have to submit a final report after the grant ends?
  sentences:
  - All Fall grant recipients are expected to submit an abstract to present an oral and/or poster presentation of their work, either in its progression or final stage.
  - a business program offered by an Alberta college, polytechnic, or university that offers the prerequisite courses required for entrance into the CPA Professional Education Program (CPA PEP).
  - The applicant is required to complete and submit a final report within 5 days of the end of the grant.
- source_sentence: In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant?
  sentences:
  - All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing.
  - This $1,000 scholarship is awarded to post -secondary students who have completed at least one year towards their Bachelor of Science with a focus on Computer Science, achieved an average GPA of 3.5 or higher, and are still enrolled in post -secondary studie s.
  - The recipient will be selected based on the highest grade in MARK320. In the event of a tie, preference will be given to the student with the highest cumulative GPA.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: BGE base Financial Matryoshka
  results:
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 768
      type: dim_768
    metrics:
    - type: cosine_accuracy@1
      value: 0.5555555555555556
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.5555555555555556
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3333333333333333
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.5555555555555556
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.8214210289682637
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7592592592592592
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.7592592592592592
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 512
      type: dim_512
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8888888888888888
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2962962962962963
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8888888888888888
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7678413135022636
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6888888888888889
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6888888888888889
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 256
      type: dim_256
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 1.0
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 1.0
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 1.0
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.3333333333333333
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.2
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.1
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 1.0
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 1.0
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 1.0
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7658654734127082
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6851851851851851
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6851851851851851
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 128
      type: dim_128
    metrics:
    - type: cosine_accuracy@1
      value: 0.4444444444444444
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.8888888888888888
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.8888888888888888
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8888888888888888
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.4444444444444444
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2962962962962963
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.17777777777777778
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08888888888888889
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.4444444444444444
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.8888888888888888
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.8888888888888888
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.8888888888888888
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7103099178571526
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.6481481481481483
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.6521164021164021
      name: Cosine Map@100
  - task:
      type: information-retrieval
      name: Information Retrieval
    dataset:
      name: dim 64
      type: dim_64
    metrics:
    - type: cosine_accuracy@1
      value: 0.6666666666666666
      name: Cosine Accuracy@1
    - type: cosine_accuracy@3
      value: 0.6666666666666666
      name: Cosine Accuracy@3
    - type: cosine_accuracy@5
      value: 0.7777777777777778
      name: Cosine Accuracy@5
    - type: cosine_accuracy@10
      value: 0.8888888888888888
      name: Cosine Accuracy@10
    - type: cosine_precision@1
      value: 0.6666666666666666
      name: Cosine Precision@1
    - type: cosine_precision@3
      value: 0.2222222222222222
      name: Cosine Precision@3
    - type: cosine_precision@5
      value: 0.15555555555555556
      name: Cosine Precision@5
    - type: cosine_precision@10
      value: 0.08888888888888889
      name: Cosine Precision@10
    - type: cosine_recall@1
      value: 0.6666666666666666
      name: Cosine Recall@1
    - type: cosine_recall@3
      value: 0.6666666666666666
      name: Cosine Recall@3
    - type: cosine_recall@5
      value: 0.7777777777777778
      name: Cosine Recall@5
    - type: cosine_recall@10
      value: 0.8888888888888888
      name: Cosine Recall@10
    - type: cosine_ndcg@10
      value: 0.7515566546007473
      name: Cosine Ndcg@10
    - type: cosine_mrr@10
      value: 0.7103174603174602
      name: Cosine Mrr@10
    - type: cosine_map@100
      value: 0.71494708994709
      name: Cosine Map@100
---

# BGE base Financial Matryoshka

This is a [sentence-transformers](https://www.SBERT.net) model finetuned from [thenlper/gte-base](https://huggingface.co/thenlper/gte-base) on the json dataset. It maps sentences & paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

## Model Details

### Model Description

- **Model Type:** Sentence Transformer
- **Base model:** [thenlper/gte-base](https://huggingface.co/thenlper/gte-base)
- **Maximum Sequence Length:** 512 tokens
- **Output Dimensionality:** 768 dimensions
- **Similarity Function:** Cosine Similarity
- **Training Dataset:**
    - json
- **Language:** en
- **License:** apache-2.0

### Model Sources

- **Documentation:** [Sentence Transformers Documentation](https://sbert.net)
- **Repository:** [Sentence Transformers on GitHub](https://github.com/UKPLab/sentence-transformers)
- **Hugging Face:** [Sentence Transformers on Hugging Face](https://huggingface.co/models?library=sentence-transformers)

### Full Model Architecture

```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': False}) with Transformer model: BertModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```

## Usage

### Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

```bash
pip install -U sentence-transformers
```

Then you can load this model and run inference.
```python
from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("TatvaRA/gte-base-finetuned-schorlaships-matryonshka")
# Run inference
sentences = [
    'In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant?',
    'All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing.',
    'The recipient will be selected based on the highest grade in MARK320. In the event of a tie, preference will be given to the student with the highest cumulative GPA.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 768]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
```

## Evaluation

### Metrics

#### Information Retrieval

* Dataset: `dim_768`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 768
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.5556     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.5556     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.5556     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| **cosine_ndcg@10**  | **0.8214** |
| cosine_mrr@10       | 0.7593     |
| cosine_map@100      | 0.7593     |

#### Information Retrieval

* Dataset: `dim_512`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 512
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.4444     |
| cosine_accuracy@3   | 0.8889     |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.4444     |
| cosine_precision@3  | 0.2963     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.4444     |
| cosine_recall@3     | 0.8889     |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| **cosine_ndcg@10**  | **0.7678** |
| cosine_mrr@10       | 0.6889     |
| cosine_map@100      | 0.6889     |

#### Information Retrieval

* Dataset: `dim_256`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 256
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.4444     |
| cosine_accuracy@3   | 1.0        |
| cosine_accuracy@5   | 1.0        |
| cosine_accuracy@10  | 1.0        |
| cosine_precision@1  | 0.4444     |
| cosine_precision@3  | 0.3333     |
| cosine_precision@5  | 0.2        |
| cosine_precision@10 | 0.1        |
| cosine_recall@1     | 0.4444     |
| cosine_recall@3     | 1.0        |
| cosine_recall@5     | 1.0        |
| cosine_recall@10    | 1.0        |
| **cosine_ndcg@10**  | **0.7659** |
| cosine_mrr@10       | 0.6852     |
| cosine_map@100      | 0.6852     |

#### Information Retrieval

* Dataset: `dim_128`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 128
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.4444     |
| cosine_accuracy@3   | 0.8889     |
| cosine_accuracy@5   | 0.8889     |
| cosine_accuracy@10  | 0.8889     |
| cosine_precision@1  | 0.4444     |
| cosine_precision@3  | 0.2963     |
| cosine_precision@5  | 0.1778     |
| cosine_precision@10 | 0.0889     |
| cosine_recall@1     | 0.4444     |
| cosine_recall@3     | 0.8889     |
| cosine_recall@5     | 0.8889     |
| cosine_recall@10    | 0.8889     |
| **cosine_ndcg@10**  | **0.7103** |
| cosine_mrr@10       | 0.6481     |
| cosine_map@100      | 0.6521     |

#### Information Retrieval

* Dataset: `dim_64`
* Evaluated with [InformationRetrievalEvaluator](https://sbert.net/docs/package_reference/sentence_transformer/evaluation.html#sentence_transformers.evaluation.InformationRetrievalEvaluator) with these parameters:
  ```json
  {
      "truncate_dim": 64
  }
  ```

| Metric              | Value      |
|:--------------------|:-----------|
| cosine_accuracy@1   | 0.6667     |
| cosine_accuracy@3   | 0.6667     |
| cosine_accuracy@5   | 0.7778     |
| cosine_accuracy@10  | 0.8889     |
| cosine_precision@1  | 0.6667     |
| cosine_precision@3  | 0.2222     |
| cosine_precision@5  | 0.1556     |
| cosine_precision@10 | 0.0889     |
| cosine_recall@1     | 0.6667     |
| cosine_recall@3     | 0.6667     |
| cosine_recall@5     | 0.7778     |
| cosine_recall@10    | 0.8889     |
| **cosine_ndcg@10**  | **0.7516** |
| cosine_mrr@10       | 0.7103     |
| cosine_map@100      | 0.7149     |

## Training Details

### Training Dataset

#### json

* Dataset: json
* Size: 73 training samples
* Columns: anchor and positive
* Approximate statistics based on the first 73 samples:

  |         | anchor | positive |
  |:--------|:-------|:---------|
  | type    | string | string   |
  | details |        |          |

* Samples:

  | anchor | positive |
  |:-------|:---------|
  | What specific type of students are the Alberta Innovates Graduate Student Scholarships designed to support? | The Alberta Innovates Graduate Student Scholarships support academically superior graduate students who are receiving training and conducting research in areas that are strategically important to Alberta’s economy. |
  | What is the specific date by which students must submit their reports for the Spring 2025 grant period? | Report due date April 20th (5 days post grant closure) |
  | In what format should applicants acknowledge the funding provided by Concordia University of Edmonton for their Student Project Grant? | All oral or poster presentations, publications, including public messages, arising from research supported by CUE grants must acknowledge the support of the institution. Acknowledgement can be in the written format, such as " This research is funded by the generous support of Concordia University of Edmonton through their CUE Student Research Project Grants program ", or similar phrasing. |

* Loss: [MatryoshkaLoss](https://sbert.net/docs/package_reference/sentence_transformer/losses.html#matryoshkaloss) with these parameters:
  ```json
  {
      "loss": "MultipleNegativesRankingLoss",
      "matryoshka_dims": [
          768,
          512,
          256,
          128,
          64
      ],
      "matryoshka_weights": [
          1,
          1,
          1,
          1,
          1
      ],
      "n_dims_per_step": -1
  }
  ```

### Training Hyperparameters

#### Non-Default Hyperparameters

- `eval_strategy`: epoch
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `gradient_accumulation_steps`: 16
- `learning_rate`: 2e-05
- `num_train_epochs`: 4
- `lr_scheduler_type`: cosine
- `warmup_ratio`: 0.1
- `fp16`: True
- `load_best_model_at_end`: True
- `optim`: adamw_torch_fused
- `batch_sampler`: no_duplicates

#### All Hyperparameters

- `overwrite_output_dir`: False
- `do_predict`: False
- `eval_strategy`: epoch
- `prediction_loss_only`: True
- `per_device_train_batch_size`: 32
- `per_device_eval_batch_size`: 16
- `per_gpu_train_batch_size`: None
- `per_gpu_eval_batch_size`: None
- `gradient_accumulation_steps`: 16
- `eval_accumulation_steps`: None
- `learning_rate`: 2e-05
- `weight_decay`: 0.0
- `adam_beta1`: 0.9
- `adam_beta2`: 0.999
- `adam_epsilon`: 1e-08
- `max_grad_norm`: 1.0
- `num_train_epochs`: 4
- `max_steps`: -1
- `lr_scheduler_type`: cosine
- `lr_scheduler_kwargs`: {}
- `warmup_ratio`: 0.1
- `warmup_steps`: 0
- `log_level`: passive
- `log_level_replica`: warning
- `log_on_each_node`: True
- `logging_nan_inf_filter`: True
- `save_safetensors`: True
- `save_on_each_node`: False
- `save_only_model`: False
- `restore_callback_states_from_checkpoint`: False
- `no_cuda`: False
- `use_cpu`: False
- `use_mps_device`: False
- `seed`: 42
- `data_seed`: None
- `jit_mode_eval`: False
- `use_ipex`: False
- `bf16`: False
- `fp16`: True
- `fp16_opt_level`: O1
- `half_precision_backend`: auto
- `bf16_full_eval`: False
- `fp16_full_eval`: False
- `tf32`: None
- `local_rank`: 0
- `ddp_backend`: None
- `tpu_num_cores`: None
- `tpu_metrics_debug`: False
- `debug`: []
- `dataloader_drop_last`: False
- `dataloader_num_workers`: 0
- `dataloader_prefetch_factor`: None
- `past_index`: -1
- `disable_tqdm`: False
- `remove_unused_columns`: True
- `label_names`: None
- `load_best_model_at_end`: True
- `ignore_data_skip`: False
- `fsdp`: []
- `fsdp_min_num_params`: 0
- `fsdp_config`: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- `fsdp_transformer_layer_cls_to_wrap`: None
- `accelerator_config`: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- `deepspeed`: None
- `label_smoothing_factor`: 0.0
- `optim`: adamw_torch_fused
- `optim_args`: None
- `adafactor`: False
- `group_by_length`: False
- `length_column_name`: length
- `ddp_find_unused_parameters`: None
- `ddp_bucket_cap_mb`: None
- `ddp_broadcast_buffers`: False
- `dataloader_pin_memory`: True
- `dataloader_persistent_workers`: False
- `skip_memory_metrics`: True
- `use_legacy_prediction_loop`: False
- `push_to_hub`: False
- `resume_from_checkpoint`: None
- `hub_model_id`: None
- `hub_strategy`: every_save
- `hub_private_repo`: False
- `hub_always_push`: False
- `gradient_checkpointing`: False
- `gradient_checkpointing_kwargs`: None
- `include_inputs_for_metrics`: False
- `eval_do_concat_batches`: True
- `fp16_backend`: auto
- `push_to_hub_model_id`: None
- `push_to_hub_organization`: None
- `mp_parameters`:
- `auto_find_batch_size`: False
- `full_determinism`: False
- `torchdynamo`: None
- `ray_scope`: last
- `ddp_timeout`: 1800
- `torch_compile`: False
- `torch_compile_backend`: None
- `torch_compile_mode`: None
- `dispatch_batches`: None
- `split_batches`: None
- `include_tokens_per_second`: False
- `include_num_input_tokens_seen`: False
- `neftune_noise_alpha`: None
- `optim_target_modules`: None
- `batch_eval_metrics`: False
- `prompts`: None
- `batch_sampler`: no_duplicates
- `multi_dataset_batch_sampler`: proportional

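
As a rough check on how the batch settings above interact with the 73-sample training set, the sketch below estimates the number of optimizer steps. It assumes a single device, no dropped last batch, and that update steps per epoch are counted as the number of batches integer-divided by `gradient_accumulation_steps`, with a floor of one. The function name `estimate_optimizer_steps` is hypothetical and not part of any library.

```python
import math


def estimate_optimizer_steps(num_samples, per_device_batch_size,
                             grad_accum_steps, num_epochs, num_devices=1):
    """Estimate (batches per epoch, optimizer steps per epoch, total steps).

    Assumption: the last, smaller batch is kept (dataloader_drop_last=False).
    """
    batches_per_epoch = math.ceil(
        num_samples / (per_device_batch_size * num_devices)
    )
    # Assumption: at least one optimizer step happens per epoch, even when
    # there are fewer batches than gradient accumulation steps.
    steps_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return batches_per_epoch, steps_per_epoch, steps_per_epoch * num_epochs


batches, steps_per_epoch, total_steps = estimate_optimizer_steps(
    num_samples=73,
    per_device_batch_size=32,
    grad_accum_steps=16,
    num_epochs=4,
)
print(batches, steps_per_epoch, total_steps)
```

Under these assumptions the 73 samples form 3 batches, which is fewer than the 16 accumulation steps, so each epoch yields a single optimizer step and the whole run yields four.
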
### Training Logs

| Epoch   | Step  | dim_768_cosine_ndcg@10 | dim_512_cosine_ndcg@10 | dim_256_cosine_ndcg@10 | dim_128_cosine_ndcg@10 | dim_64_cosine_ndcg@10 |
|:-------:|:-----:|:----------------------:|:----------------------:|:----------------------:|:----------------------:|:---------------------:|
| 1.0     | 1     | 0.7249                 | 0.7249                 | 0.7473                 | 0.7026                 | 0.6686                |
| 2.0     | 2     | 0.7619                 | 0.7249                 | 0.7533                 | 0.7026                 | 0.7480                |
| **3.0** | **3** | **0.7804**             | **0.7619**             | **0.7659**             | **0.7103**             | **0.7496**            |
| 4.0     | 4     | 0.8214                 | 0.7678                 | 0.7659                 | 0.7103                 | 0.7516                |

* The bold row denotes the saved checkpoint.

### Framework Versions

- Python: 3.11.12
- Sentence Transformers: 4.1.0
- Transformers: 4.41.2
- PyTorch: 2.1.2+cu121
- Accelerate: 1.5.2
- Datasets: 2.19.1
- Tokenizers: 0.19.1

## Citation

### BibTeX

#### Sentence Transformers

```bibtex
@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}
```

#### MatryoshkaLoss

```bibtex
@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}
```

#### MultipleNegativesRankingLoss

```bibtex
@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}
```
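
Because the model was trained with `MatryoshkaLoss` over the dimensions 768, 512, 256, 128 and 64, its embeddings can be shortened to any of those sizes before comparison. The sketch below illustrates the idea in plain Python on made-up toy vectors, not real model output: keep the first `dim` components, rescale to unit length, then compare with cosine similarity. The helper names `truncate_and_normalize` and `cosine_similarity` are illustrative and not part of the Sentence Transformers API.

```python
import math


def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` components and rescale to unit length."""
    if not 0 < dim <= len(embedding):
        raise ValueError("dim must be between 1 and the embedding length")
    head = embedding[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in head]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy 8-dimensional vectors standing in for 768-dimensional embeddings.
query = [0.9, 0.1, 0.3, 0.2, 0.05, 0.4, 0.1, 0.2]
passage = [0.8, 0.2, 0.25, 0.1, 0.3, 0.1, 0.5, 0.0]

# Compare the same pair at progressively smaller sizes.
for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine_similarity(q, p), 4))
```

With the real model, the same truncation is normally requested through the library, for example by passing `truncate_dim` when constructing `SentenceTransformer`, which matches the `truncate_dim` values listed in the evaluation section.
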