BGE large Financial Matryoshka

This is a sentence-transformers model fine-tuned from BAAI/bge-large-en-v1.5. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

Model Type: Sentence Transformer
Base model: BAAI/bge-large-en-v1.5
Maximum Sequence Length: 512 tokens
Output Dimensionality: 1024 dimensions
Similarity Function: Cosine Similarity
Language: en
License: apache-2.0

Model Sources

Documentation: Sentence Transformers Documentation
Repository: Sentence Transformers on GitHub
Hugging Face: Sentence Transformers on Hugging Face

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
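
The last two modules above can be illustrated with a minimal sketch in plain Python: CLS pooling keeps only the first token's vector, and Normalize rescales it to unit length. The toy 4-dimensional token vectors and the helper names `cls_pool` and `l2_normalize` are hypothetical stand-ins, not part of the library.

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

# Toy stand-in for the Transformer output: 3 tokens, 4 dimensions each
# (the real model emits 1024 dimensions per token).
token_embeddings = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS]
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 0.0, 2.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.8, 0.0, 0.0]
```

Because the output is unit length, the dot product of two embeddings equals their cosine similarity.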

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bnkc123/obscura-v1")
# Run inference
sentences = [
    'Which additional coverage table shows a $1,000 limit for identity theft on a standard policy but $10,000 for Platinum and GrandProtect?',
    "farmers lloyd's insurance company of texas texas residential property manual updated: may, 2020 page 3 section i - additional coverages additional coverages ho-2 homeowners, homeowners, market value, mobile homeowners, renters, condominium and landlord's rental platinum and grandprotect products (includes homeowners, renters and condominium) loss of use additional living expense or fair rental value and loss of rental income increased limits available prohibited use refer to rule 2 yes up to 14 days refer to rule 2 yes for platinum up to 45 days debris removal 10% 10% reasonable repairs yes yes fire department charges $750 $1000 emergency removal of property 30 days 30 days emergency living expense $500 $500 refrigerated contents $1000 $1500 identity theft and credit protection (cov. 9) increased limits available $1000 yes $10,000 no data and records $1500 for personal none for business $2500 lock replacement yes yes reward coverage $5000 $5000 trees, shrubs and plants (coverage 12) increased limits available $500 per item/ 5% aggregate yes $500 per item/ 5% aggregate yes loss assessment (coverage 6) increased limits available $1000 yes $10,000 yes land $10,000 $10,000 volcanic action yes yes collapse yes yes inflation protection yes yes landlord's furnishings $2500 $2500 fungus and mold remediation $5000 $5000 backup of sewer, drain and sump pump (coverage 13) optional $10,000 increased limits available newly acquired watercraft n/a with grandprotect identity fraud n/a with grandprotect ordinance or law (coverage 15) optional grandprotect - blank property limit platinum - 50% of cov.",
    "a increased limits available section ii - additional coverages additional coverages ho-2 homeowners, homeowners, market value, mobile homeowners, renters, condominium and landlord's rental platinum and grandprotect products (includes homeowners, renters and condominium) damage of property of others $500 $1500 claim expenses yes, including $200 for lost wages yes, including $250 for lost wages first aid expenses yes yes borrowed or rented watercraft n/a with grandprotect personal injury (coverage 25) optional included",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
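
Since the model was trained with MatryoshkaLoss at 1024, 512, and 256 dimensions, embeddings can in principle be shortened to their first 512 or 256 values. The sketch below shows that idea in plain Python with toy 8-dimensional vectors. The helper names are hypothetical, and re-normalizing after truncation is an assumption made here so that the dot product remains a cosine similarity.

```python
import math

def truncate_and_normalize(embedding, dim):
    """Keep the first `dim` values, then rescale to unit length."""
    head = list(embedding[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def cosine(a, b):
    """For unit-length vectors, cosine similarity is the dot product."""
    return sum(x * y for x, y in zip(a, b))

# Toy 8-dimensional vectors standing in for 1024-dimensional model outputs.
query = [0.5, 0.5, 0.5, 0.5, 0.0, 0.0, 0.0, 0.0]
passage = [0.5, 0.5, 0.0, 0.0, 0.5, 0.5, 0.0, 0.0]

for dim in (8, 4, 2):
    q = truncate_and_normalize(query, dim)
    p = truncate_and_normalize(passage, dim)
    print(dim, round(cosine(q, p), 4))
```

With the library itself, the same effect is usually obtained by passing `truncate_dim` when constructing `SentenceTransformer`, which is also the parameter the evaluations in this card use.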

Evaluation

Metrics

Information Retrieval

Dataset: dim_1024
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 1024
}
```

Metric	Value
cosine_accuracy@1	0.1471
cosine_accuracy@3	0.3333
cosine_accuracy@5	0.4118
cosine_accuracy@10	0.5294
cosine_precision@1	0.1471
cosine_precision@3	0.1111
cosine_precision@5	0.0824
cosine_precision@10	0.0529
cosine_recall@1	0.1471
cosine_recall@3	0.3333
cosine_recall@5	0.4118
cosine_recall@10	0.5294
cosine_ndcg@10	0.3255
cosine_mrr@10	0.262
cosine_map@100	0.2699
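
The accuracy@k and recall@k values in this table are identical, which is what one would expect if each query has exactly one relevant passage. That is an inference from the numbers, not something the card states. Under that assumption, the metrics can be sketched as follows; the function names and passage ids are hypothetical, and this is not the evaluator's actual implementation.

```python
def rank_of_relevant(ranked_ids, relevant_id):
    """1-based rank of the relevant passage, or None if it was not retrieved."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == relevant_id:
            return position
    return None

def retrieval_metrics(rankings, k):
    """rankings: list of (ranked_ids, relevant_id) pairs, one per query."""
    hits, reciprocal_ranks = 0, 0.0
    for ranked_ids, relevant_id in rankings:
        rank = rank_of_relevant(ranked_ids[:k], relevant_id)
        if rank is not None:
            hits += 1
            reciprocal_ranks += 1.0 / rank
    n = len(rankings)
    accuracy = hits / n  # share of queries with a hit in the top k
    return {
        "accuracy": accuracy,
        "recall": accuracy,         # assumes one relevant passage per query
        "precision": accuracy / k,  # at most one of the k results is relevant
        "mrr": reciprocal_ranks / n,
    }

# Hypothetical rankings for three queries (passage ids are made up).
rankings = [
    (["p1", "p7", "p3"], "p1"),  # hit at rank 1
    (["p4", "p2", "p9"], "p9"),  # hit at rank 3
    (["p5", "p6", "p8"], "p0"),  # miss
]
print(retrieval_metrics(rankings, k=3))
```

The reported figures are consistent with this relationship: cosine_accuracy@3 of 0.3333 divided by 3 gives the cosine_precision@3 of 0.1111, and cosine_accuracy@10 of 0.5294 divided by 10 gives the cosine_precision@10 of 0.0529.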

Information Retrieval

Dataset: dim_512
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 512
}
```

Metric	Value
cosine_accuracy@1	0.1471
cosine_accuracy@3	0.3039
cosine_accuracy@5	0.4118
cosine_accuracy@10	0.5098
cosine_precision@1	0.1471
cosine_precision@3	0.1013
cosine_precision@5	0.0824
cosine_precision@10	0.051
cosine_recall@1	0.1471
cosine_recall@3	0.3039
cosine_recall@5	0.4118
cosine_recall@10	0.5098
cosine_ndcg@10	0.3131
cosine_mrr@10	0.2519
cosine_map@100	0.2607

Information Retrieval

Dataset: dim_256
Evaluated with InformationRetrievalEvaluator with these parameters:
```
{
    "truncate_dim": 256
}
```

Metric	Value
cosine_accuracy@1	0.1275
cosine_accuracy@3	0.3235
cosine_accuracy@5	0.4118
cosine_accuracy@10	0.4902
cosine_precision@1	0.1275
cosine_precision@3	0.1078
cosine_precision@5	0.0824
cosine_precision@10	0.049
cosine_recall@1	0.1275
cosine_recall@3	0.3235
cosine_recall@5	0.4118
cosine_recall@10	0.4902
cosine_ndcg@10	0.3021
cosine_mrr@10	0.2428
cosine_map@100	0.2522

Training Details

Training Dataset

Unnamed Dataset

Size: 909 training samples
Columns: anchor, positive, and negative

Approximate statistics based on the first 909 samples:

	anchor	positive	negative
type	string	string	string
details	min: 11 tokens, mean: 25.39 tokens, max: 59 tokens	min: 35 tokens, mean: 307.94 tokens, max: 512 tokens	min: 13 tokens, mean: 263.84 tokens, max: 512 tokens

Samples:

anchor	positive	negative
`For a wood shake or shingle roof in good condition, what is the maximum age allowed for Replacement Cost coverage on a DP-3 policy?`	`roofing systems in fair condition do not qualify for replacement cost coverage. roofing systems in poor condition will have coverage for the roofing system limited to fire and lightning only regardless of age. roofing system age / condition of the roof system excellent condition good condition asphalt / composition 1-22 1-15 slate 1-35 1-28 metal 1-56 1-34 flat/built-up/roll n/a n/a tile 1-35 1-28 wood shake / shingle 1-13 1-8`	aegis california secondary residence insurance program 29 dp-man-ca (ed. 8) 16. roofs a. roof age the signed application will specifically disclose the age of the roof. the age of the roof is determined by subtracting the year the roof was installed from the year that the policy takes effect. the roof age will be updated manually at each policy renewal. if the roof age is updated or changed due to roof replacement, a copy of evidence (e.g. - copy of roof manufacturer's warranty indicating replacement date, copy of roof age disclosure statement from real estate transaction, receipt from roofing contractor) showing the date the roof was replaced must be submitted to the company. b. roof system type acceptable roof systems are as follows: 1. asphalt / composition - includes: (a) asphalt - shingle (fiberglass) (b) asphalt - shingle (architectural) (c) asphalt - shingle (architectural - hq) (d) composite - impact resistance shingle (e) composite - shake (f) composite - tile 2. slate - inclu...
`Which coverage form is used to insure the personal property of a tenant occupying a single-family dwelling or 1–4 family dwelling?`	american commerce insurance company ohio property program rules manual american commerce insurance company page 3 of 39 (04/20) form types ho3: special form- provides "open perils" coverage on the dwelling and other structures and "named perils" coverage on personal property. this policy may be written on an owner- occupied single-family, duplex, triplex, and fourplex dwelling used exclusively for private residential purposes with no more than 1 family per unit. at least one unit of the multi-family dwelling must be occupied by the insured. ho4: contents broad form - provides "named perils" coverage on the personal property of a tenant(s) occupying an apartment, townhouse, condominium, single-family dwelling or one unit in a 1-4 family dwelling used exclusively for private residential purposes with no more than 2 roomers or boarders. ho6: unit owners form - provides "named perils" coverage on building property and personal property for an insured who resides in an owner-occupied single...	`american commerce insurance company ohio property program rules manual american commerce insurance company page 4 of 39 (04/20) package policy requirements the following minimum limits apply to each form type. minimum package policy requirements ho3 ho4 ho6 cva base coverage -100% replacement cost n/a 10% of cvc cvb 10% of cva n/a n/a cvc 70% of cva base coverage base coverage cvd 20% of cva 40% of cvc 40% of cvc cvl $100,000 $100,000 $100,000 cvm $1,000 $1,000 $1,000`
`How does the manual define a seasonal dwelling?`	safeport insurance company homeowners program manual - south carolina (2020) general rules includes copyrighted material of insurance services office, inc. with its permission page 9 of 36 f. permitted business occupancies certain business occupancies are permitted, pro- vided: 1. the premises is occupied principally for private residential purposes, and 2. there is no other business occupancy on the premises. when the business is conducted on the residence premises, refer to rules 509. and 510. for section i coverage and rules 607. and 608. for section ii cov- erage. when it is conducted from an other resi- dence, only section ii coverage is available. refer to rules 607. and 608. g. farm property a homeowners policy shall not be issued to cover any property to which farm forms or rates apply under the rules of the company, except as noted in following paragraphs 1. and 2.: 1. section i - property - livestock collision coverage may be provided for loss due to colli- sion which results...	safeport insurance company homeowners program manual - south carolina (2020) general rules includes copyrighted material of insurance services office, inc. with its permission page 10 of 36 3. fire resistive exterior walls and floors and roof constructed of masonry or other fire resistive materials. e. mixed (masonry/frame) a combination of both fr ame and masonry construc- tion shall be classed as frame when the exterior walls of frame construction (including gables) exceed 33 1/3% of the total exterior wall area; otherwise class as masonry. rule 108. seasonal dwelling definition a seasonal dwelling is a dwelling with continuous un-oc- cupancy of three or more consecutive months during any one-year period. rule 109. single and separate buildings definition a. single building all buildings or sections of buildings which are acces- sible through unprotected openings shall be consid- ered as a single building. b. separate building 1. buildings which are separated by space shall be consid...

Loss: MatryoshkaLoss with these parameters:

{
    "loss": "TripletLoss",
    "matryoshka_dims": [
        1024,
        512,
        256
    ],
    "matryoshka_weights": [
        1,
        1,
        1
    ],
    "n_dims_per_step": -1
}
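
As a rough illustration of this configuration, the sketch below applies a triplet loss to the same embeddings truncated to each Matryoshka dimension and sums the weighted results. Several details are assumptions that the card does not state: Euclidean distance, a margin of 1.0, and re-normalization after truncation. The helper names and the toy 4-dimensional vectors are hypothetical.

```python
import math

def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values and rescale to unit length (assumed step)."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head] if norm > 0 else head

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def triplet_loss(anchor, positive, negative, margin):
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0)"""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)

def matryoshka_triplet_loss(anchor, positive, negative, dims, weights, margin):
    """Weighted sum of the triplet loss over each truncated dimension."""
    total = 0.0
    for dim, weight in zip(dims, weights):
        a, p, n = (truncate_and_normalize(v, dim) for v in (anchor, positive, negative))
        total += weight * triplet_loss(a, p, n, margin)
    return total

# Toy vectors; dims [4, 2] play the role of [1024, 512, 256].
anchor = [1.0, 0.0, 0.0, 0.0]
close = [1.0, 0.0, 0.0, 0.0]
far = [0.0, 1.0, 0.0, 0.0]

easy = matryoshka_triplet_loss(anchor, close, far, dims=[4, 2], weights=[1, 1], margin=1.0)
hard = matryoshka_triplet_loss(anchor, far, close, dims=[4, 2], weights=[1, 1], margin=1.0)
print(easy, hard)
```

The easy triplet, where the positive is nearer than the negative by more than the margin at every dimension, contributes no loss. The hard triplet is penalized at each truncated size, which is what pushes the leading dimensions to carry useful signal on their own.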

Training Hyperparameters

Non-Default Hyperparameters

eval_strategy: epoch
per_device_train_batch_size: 6
per_device_eval_batch_size: 16
gradient_accumulation_steps: 16
num_train_epochs: 8
lr_scheduler_type: cosine
warmup_ratio: 0.1
bf16: True
tf32: True
load_best_model_at_end: True
optim: adamw_torch_fused
push_to_hub: True
hub_model_id: bnkc123/obscura-v1
batch_sampler: no_duplicates
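
To make these settings concrete, here is a small sketch of the effective batch size and of a linear-warmup, cosine-decay learning rate curve. It assumes a single device (the card does not state the device count), takes the base learning rate of 5e-05 from the full hyperparameter list, and takes 72 total steps from the last row of the training log. The function `lr_at_step` is a hypothetical re-implementation for illustration, not the trainer's own code.

```python
import math

def lr_at_step(step, total_steps, base_lr, warmup_ratio):
    """Linear warmup followed by cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

per_device_train_batch_size = 6
gradient_accumulation_steps = 16
num_devices = 1  # assumption: the card does not state the device count

effective_batch_size = per_device_train_batch_size * gradient_accumulation_steps * num_devices
print(effective_batch_size)  # 96

total_steps = 72  # last step in the training log
for step in (0, 8, 40, 72):
    print(step, lr_at_step(step, total_steps, base_lr=5e-05, warmup_ratio=0.1))
```

Under these assumptions the rate climbs for the first 8 steps, peaks at 5e-05, and decays to zero by step 72.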

All Hyperparameters

overwrite_output_dir: False
do_predict: False
eval_strategy: epoch
prediction_loss_only: True
per_device_train_batch_size: 6
per_device_eval_batch_size: 16
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 16
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 8
max_steps: -1
lr_scheduler_type: cosine
lr_scheduler_kwargs: {}
warmup_ratio: 0.1
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: True
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: True
fp16: False
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: True
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: False
dataloader_num_workers: 0
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: True
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch_fused
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: None
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: True
resume_from_checkpoint: None
hub_model_id: bnkc123/obscura-v1
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional

Training Logs

Epoch	Step	Training Loss	dim_1024_cosine_ndcg@10	dim_512_cosine_ndcg@10	dim_256_cosine_ndcg@10
1.0	10	24.8465	0.5265	0.5079	0.5108
2.0	20	16.454	0.4701	0.4565	0.4235
3.0	30	9.4107	0.3821	0.3536	0.3599
4.0	40	4.786	0.3482	0.3464	0.3413
5.0	50	2.675	0.3266	0.3142	0.3150
6.0	60	1.542	0.3303	0.3161	0.3052
7.0	70	1.1167	0.3257	0.3131	0.3009
7.2105	72	-	0.3255	0.3131	0.3021

The bold row denotes the saved checkpoint.

Framework Versions

Python: 3.12.6
Sentence Transformers: 4.1.0
Transformers: 4.51.3
PyTorch: 2.7.0+cu126
Accelerate: 1.6.0
Datasets: 3.5.1
Tokenizers: 0.21.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MatryoshkaLoss

@misc{kusupati2024matryoshka,
    title={Matryoshka Representation Learning},
    author={Aditya Kusupati and Gantavya Bhatt and Aniket Rege and Matthew Wallingford and Aditya Sinha and Vivek Ramanujan and William Howard-Snyder and Kaifeng Chen and Sham Kakade and Prateek Jain and Ali Farhadi},
    year={2024},
    eprint={2205.13147},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
