SentenceTransformer based on BAAI/bge-large-en-v1.5
This is a sentence-transformers model finetuned from BAAI/bge-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-large-en-v1.5
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 384, 'do_lower_case': True, 'architecture': 'BertModel'})
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("dhruvnayee/help_texted_mined_r3s_0810")
# Run inference
sentences = [
'What inputs does Qx accept?',
'## Examples q\ue04f\ue052 | =Qx(58, , LT) q\ue028\ue04f\ue053\ue029\ue02a\ue027 | =Qx(59+1, 1, LT)',
"Invalid Target Variable Message: '<name>' cannot be an array because it is the target input range.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[1.0000, 0.8317, 0.8164],
# [0.8317, 1.0000, 0.4416],
# [0.8164, 0.4416, 1.0000]])
Training Details
Training Dataset
Unnamed Dataset
- Size: 16,688 training samples
- Columns:
sentence_0
,sentence_1
, andsentence_2
- Approximate statistics based on the first 1000 samples:
sentence_0 sentence_1 sentence_2 type string string string details - min: 7 tokens
- mean: 13.85 tokens
- max: 26 tokens
- min: 3 tokens
- mean: 123.19 tokens
- max: 384 tokens
- min: 3 tokens
- mean: 118.89 tokens
- max: 384 tokens
- Samples:
sentence_0 sentence_1 sentence_2 What are the new features in this release?
## What's new from previous upgrades
What's new in version 1.2 Targeting * This functionality may be used to find the value of an input variable giving the specified value of an output variable, for example for use in profit testing Enhancements to distributed processing * This provides for greater functionality in distributed processing Enhancements to Compare * This extends the functionality of the Compare Page to allow comparison of multiple components Initialization variables * This new component allows greater flexibility and simplicity of coding variables, allowing different definitions at the outset of a projection and during the projection Stochastic processes (not available in R³S Modeler Lite ) * This involves enhancements to the existing stochastic process functionality Results workspaces * This enhances the information provided about related results workspaces on the opening page of a workspace Program Linker and Model Builder * This enables greater ease of adding and moving items within these tools Analyzer e...
What does this element represent?
Data_Source_Name The Data_Source_Name system variable is a character variable that gives the name of the data source. You can use this system variable in a data source in a data process in the data layer of a model. This system variable is a placeholder variable.
Cannot rerun results workspace Message: It is not possible to rerun a results workspace that contained any model that failed to build.
How does R3S Modeler create parent program records?
Record_Is_Last_Step The Record_Is_Last_Step system variable is an indicator variable that is 1 if the Record_End_Date system variable for the current program record is a date that is in the current projection step and 0 otherwise. This system variable is not defined for parent program records that R³S Modeler creates solely by aggregating child program records (and does not read from data).
## Circumstances This error occurs when the variable used in a formula does not exist in the workspace (for example, it is not in the Variable Chooser ). For example, if a variable, say, CF_Premium is defined as Prem_Annual * Prob_Surr and the variable Prob_Surr does not exist within the workspace then the above error will occur.
- Loss:
TripletLoss
with these parameters:{ "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
Training Hyperparameters
Non-Default Hyperparameters
num_train_epochs
: 1fp16
: Truemulti_dataset_batch_sampler
: round_robin
All Hyperparameters
Click to expand
overwrite_output_dir
: Falsedo_predict
: Falseeval_strategy
: noprediction_loss_only
: Trueper_device_train_batch_size
: 8per_device_eval_batch_size
: 8per_gpu_train_batch_size
: Noneper_gpu_eval_batch_size
: Nonegradient_accumulation_steps
: 1eval_accumulation_steps
: Nonetorch_empty_cache_steps
: Nonelearning_rate
: 5e-05weight_decay
: 0.0adam_beta1
: 0.9adam_beta2
: 0.999adam_epsilon
: 1e-08max_grad_norm
: 1num_train_epochs
: 1max_steps
: -1lr_scheduler_type
: linearlr_scheduler_kwargs
: {}warmup_ratio
: 0.0warmup_steps
: 0log_level
: passivelog_level_replica
: warninglog_on_each_node
: Truelogging_nan_inf_filter
: Truesave_safetensors
: Truesave_on_each_node
: Falsesave_only_model
: Falserestore_callback_states_from_checkpoint
: Falseno_cuda
: Falseuse_cpu
: Falseuse_mps_device
: Falseseed
: 42data_seed
: Nonejit_mode_eval
: Falseuse_ipex
: Falsebf16
: Falsefp16
: Truefp16_opt_level
: O1half_precision_backend
: autobf16_full_eval
: Falsefp16_full_eval
: Falsetf32
: Nonelocal_rank
: 0ddp_backend
: Nonetpu_num_cores
: Nonetpu_metrics_debug
: Falsedebug
: []dataloader_drop_last
: Falsedataloader_num_workers
: 0dataloader_prefetch_factor
: Nonepast_index
: -1disable_tqdm
: Falseremove_unused_columns
: Truelabel_names
: Noneload_best_model_at_end
: Falseignore_data_skip
: Falsefsdp
: []fsdp_min_num_params
: 0fsdp_config
: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}fsdp_transformer_layer_cls_to_wrap
: Noneaccelerator_config
: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}deepspeed
: Nonelabel_smoothing_factor
: 0.0optim
: adamw_torchoptim_args
: Noneadafactor
: Falsegroup_by_length
: Falselength_column_name
: lengthddp_find_unused_parameters
: Noneddp_bucket_cap_mb
: Noneddp_broadcast_buffers
: Falsedataloader_pin_memory
: Truedataloader_persistent_workers
: Falseskip_memory_metrics
: Trueuse_legacy_prediction_loop
: Falsepush_to_hub
: Falseresume_from_checkpoint
: Nonehub_model_id
: Nonehub_strategy
: every_savehub_private_repo
: Nonehub_always_push
: Falsegradient_checkpointing
: Falsegradient_checkpointing_kwargs
: Noneinclude_inputs_for_metrics
: Falseinclude_for_metrics
: []eval_do_concat_batches
: Truefp16_backend
: autopush_to_hub_model_id
: Nonepush_to_hub_organization
: Nonemp_parameters
:auto_find_batch_size
: Falsefull_determinism
: Falsetorchdynamo
: Noneray_scope
: lastddp_timeout
: 1800torch_compile
: Falsetorch_compile_backend
: Nonetorch_compile_mode
: Nonedispatch_batches
: Nonesplit_batches
: Noneinclude_tokens_per_second
: Falseinclude_num_input_tokens_seen
: Falseneftune_noise_alpha
: Noneoptim_target_modules
: Nonebatch_eval_metrics
: Falseeval_on_start
: Falseuse_liger_kernel
: Falseeval_use_gather_object
: Falseaverage_tokens_across_devices
: Falseprompts
: Nonebatch_sampler
: batch_samplermulti_dataset_batch_sampler
: round_robinrouter_mapping
: {}learning_rate_mapping
: {}
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.2397 | 500 | 4.8389 |
0.4794 | 1000 | 4.7385 |
0.7191 | 1500 | 4.7068 |
0.9588 | 2000 | 4.7199 |
Framework Versions
- Python: 3.11.11
- Sentence Transformers: 5.1.1
- Transformers: 4.49.0
- PyTorch: 2.5.1+cu124
- Accelerate: 1.3.0
- Datasets: 3.2.0
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
- Downloads last month
- 9
Model tree for dhruvnayee/help_texted_mined_r3s_0810
Base model
BAAI/bge-large-en-v1.5