tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:1225740
  - loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-m3
widget:
  - source_sentence: >-
      yoghurt cow chocolate chip sugar reduced half skimmed in plastic container
      commercial supermarket shop organic shop </s> This facet allows recording
      the place where the food was prepared for consumption. Only one descriptor
      from this facet can be added to each entry.
    sentences:
      - >-
        Product obtained during the processing of screened, dehusked barley into
        pearl barley, semolina or flour. It consists principally of particles of
        endosperm with fine fragments of the outer skins and some grain
        screenings.
      - Produced by industry in the form it arrives to the final consumer
      - >-
        Tree nuts from the plant classified under the species Corylus avellana
        L., commonly known as Hazelnuts or Cobnuts or Common hazelnut. The part
        consumed/analysed is not specified. When relevant, information on the
        part consumed/analysed has to be reported with additional facet
        descriptors. In case of data collections related to legislations, the
        default part consumed/analysed is the one defined in the applicable
        legislation.
  - source_sentence: >-
      sauce cold liquid preservation method onion mint croutons sweet pepper
      prepared at a restaurant </s> This facet collects ingredients and/or
      flavour note. Regarding ingredients this facet serves the purpose of
      providing information on ingredients of a composite food being important
      from some point of view, like allergic reactions, hazards, but also
      aspect, taste. The descriptors for this facet are taken from a selected
      subset of the main list (actually a relevant part of the food list). More
      (none contradicting) descriptors can be applied to each entry.
    sentences:
      - >-
        Spices from the fruits of the plant classified under the species Piper
        cubeba L. f., commonly known as Cubeb fruit or Tailed pepper. The part
        consumed/analysed is not specified. When relevant, information on the
        part consumed/analysed has to be reported with additional facet
        descriptors. In case of data collections related to legislations, the
        default part consumed/analysed is the one defined in the applicable
        legislation.
      - >-
        Tree nuts from the plant classified under the genus Juglans L. spp.,
        commonly known as Walnuts or Walnut Black or Walnut English or Walnut
        Persian. The part consumed/analysed is not specified. When relevant,
        information on the part consumed/analysed has to be reported with
        additional facet descriptors. In case of data collections related to
        legislations, the default part consumed/analysed is the one defined in
        the applicable legislation.
      - >-
        Fruiting vegetables from the plant classified under the species Capsicum
        annuum var. grossum (L.) Sendtner or Capsicum annuum var. longum Bailey,
        commonly known as Sweet peppers or Bell peppers or Paprika or
        PeppersLong or Pimento or Pimiento. The part consumed/analysed is not
        specified. When relevant, information on the part consumed/analysed has
        to be reported with additional facet descriptors. In case of data
        collections related to legislations, the default part consumed/analysed
        is the one defined in the applicable legislation.
  - source_sentence: >-
      yoghurt with fruits cow passion fruit sweetened with sugar sucrose fat
      content in plastic container commercial supermarket shop organic shop </s>
      This facet provides some principal claims related to important
      nutrients-ingredients, like fat, sugar etc. It is not intended to include
      health claims or similar. The present guidance provides a limited list, to
      be eventually improved during the evolution of the system. More than one
      descriptor can be applied to each entry, provided they are not
      contradicting each other.
    sentences:
      - >-
        Product where all or part of the sugar has been added during processing
        and is not naturally contained
      - >-
        Infusion materials from flowers of the plant classified under the genus
        Rosa L. spp., commonly known as Rose infusion flowers. The part
        consumed/analysed is not specified. When relevant, information on the
        part consumed/analysed has to be reported with additional facet
        descriptors. In case of data collections related to legislations, the
        default part consumed/analysed is the one defined in the applicable
        legislation.
      - >-
        Molecules providing intensive sweet sensation, used to substitute
        natural sugars in food formulas
  - source_sentence: >-
      pepper sweet green facets desc physical state form as quantified grated
      cooking method stir fried sauted preservation method fresh </s> This facet
      describes the form (physical aspect) of the food as reported by the
      consumer (as estimated during interview or as registered in the diary)
      (Consumption Data) or as expressed in the analysis results in the
      laboratory (Occurrence Data). Only one descriptor from this facet can be
      added to each entry, apart from the specification “with solid particles”.
      This facet should only be used in case of raw foods and ingredients (not
      for composite foods).
    sentences:
      - Unprocessed and not stored over any long period
      - >-
        Paste coarsely divided, where particles are still recognisable at naked
        eye
      - The food item is considered in its form with skin
  - source_sentence: >-
      tome des bauges raw milk aoc in plastic container brand product name </s>
      This facet allows recording whether the food list code was chosen because
      of lack of information on the food item or because the proper entry in the
      food list was missing. Only one descriptor from this facet can be added to
      each entry.
    sentences:
      - >-
        The food list item has been chosen because none of the more detailed
        items corresponded to the available information. Please consider the
        eventual addition of a new term in the list
      - >-
        The food item has a fat content which, when rounded with the standard
        rules of rounding, equals 25 % (weight/weight)
      - >-
        Deprecated term that must NOT be used for any purpose. Its original
        scopenote was: The group includes any type of Other fruiting vegetables
        (exposure). The part consumed/analysed is by default unspecified. When
        relevant, information on the part consumed/analysed has to be reported
        with additional facet descriptors.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
  - cosine_accuracy@1
  - cosine_accuracy@3
  - cosine_accuracy@5
  - cosine_accuracy@10
  - cosine_precision@1
  - cosine_precision@3
  - cosine_precision@5
  - cosine_precision@10
  - cosine_recall@1
  - cosine_recall@3
  - cosine_recall@5
  - cosine_recall@10
  - cosine_ndcg@10
  - cosine_mrr@10
  - cosine_map@100
model-index:
  - name: SentenceTransformer based on BAAI/bge-m3
    results:
      - task:
          type: device-aware-information-retrieval
          name: Device Aware Information Retrieval
        dataset:
          name: Unknown
          type: unknown
        metrics:
          - type: cosine_accuracy@1
            value: 0.9849655460430152
            name: Cosine Accuracy@1
          - type: cosine_accuracy@3
            value: 0.9989559406974317
            name: Cosine Accuracy@3
          - type: cosine_accuracy@5
            value: 0.9997911881394863
            name: Cosine Accuracy@5
          - type: cosine_accuracy@10
            value: 1
            name: Cosine Accuracy@10
          - type: cosine_precision@1
            value: 0.9849655460430152
            name: Cosine Precision@1
          - type: cosine_precision@3
            value: 0.41713649335282244
            name: Cosine Precision@3
          - type: cosine_precision@5
            value: 0.25370641052411774
            name: Cosine Precision@5
          - type: cosine_precision@10
            value: 0.12752140321570266
            name: Cosine Precision@10
          - type: cosine_recall@1
            value: 0.8690666019440294
            name: Cosine Recall@1
          - type: cosine_recall@3
            value: 0.993924343214383
            name: Cosine Recall@3
          - type: cosine_recall@5
            value: 0.998536283094646
            name: Cosine Recall@5
          - type: cosine_recall@10
            value: 0.9999462151268373
            name: Cosine Recall@10
          - type: cosine_ndcg@10
            value: 0.9936056206465634
            name: Cosine Ndcg@10
          - type: cosine_mrr@10
            value: 0.9919155008004455
            name: Cosine Mrr@10
          - type: cosine_map@100
            value: 0.9909164791232326
            name: Cosine Map@100

SentenceTransformer based on BAAI/bge-m3

This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-m3
  • Maximum Sequence Length: 96 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 96, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
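
The Pooling module above has only pooling_mode_mean_tokens enabled, so a sentence vector is the average of the token vectors that are not padding. The following is a minimal pure-Python sketch of that idea, not the library's implementation; `mean_pool` is a hypothetical helper and the two-dimensional token vectors are toy values (the real model uses 1024 dimensions).

```python
def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    summed = [0.0] * dim
    count = 0
    for vector, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for j, value in enumerate(vector):
                summed[j] += value
    count = max(count, 1)  # guard against an all-padding input
    return [s / count for s in summed]


# Toy example: two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```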

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")
# Run inference
sentences = [
    'tome des bauges raw milk aoc in plastic container brand product name </s> This facet allows recording whether the food list code was chosen because of lack of information on the food item or because the proper entry in the food list was missing. Only one descriptor from this facet can be added to each entry.',
    'The food list item has been chosen because none of the more detailed items corresponded to the available information. Please consider the eventual addition of a new term in the list',
    'Deprecated term that must NOT be used for any purpose. Its original scopenote was: The group includes any type of Other fruiting vegetables (exposure). The part consumed/analysed is by default unspecified. When relevant, information on the part consumed/analysed has to be reported with additional facet descriptors.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
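
The example inputs in this card pair a food description with a facet description, joined by "</s>", and the candidate sentences are descriptor definitions. The sketch below shows one way to build such a query string and rank candidates by cosine similarity. It is an illustration under assumptions: `build_query`, `cosine` and `rank_descriptors` are hypothetical helpers, and the toy two-dimensional vectors stand in for the 1024-dimensional output of `model.encode`.

```python
import math

SEP = "</s>"  # separator as it appears in the card's example inputs


def build_query(food_description, facet_description):
    """Join a food description and a facet description into one query string."""
    return f"{food_description.strip()} {SEP} {facet_description.strip()}"


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_descriptors(query_vec, descriptor_vecs, descriptors):
    """Return (score, descriptor) pairs sorted from most to least similar."""
    scored = [(cosine(query_vec, vec), text)
              for vec, text in zip(descriptor_vecs, descriptors)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


query = build_query("tome des bauges raw milk aoc",
                    "This facet allows recording whether the food list code was chosen ...")
# Toy vectors; in practice these would come from model.encode(...).
ranked = rank_descriptors([1.0, 0.0],
                          [[0.0, 1.0], [1.0, 0.1]],
                          ["descriptor A", "descriptor B"])
print(ranked[0][1])  # descriptor B
```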

Evaluation

Metrics

Device Aware Information Retrieval

  • Evaluated with src.utils.eval_functions.DeviceAwareInformationRetrievalEvaluator
Metric Value
cosine_accuracy@1 0.985
cosine_accuracy@3 0.999
cosine_accuracy@5 0.9998
cosine_accuracy@10 1.0
cosine_precision@1 0.985
cosine_precision@3 0.4171
cosine_precision@5 0.2537
cosine_precision@10 0.1275
cosine_recall@1 0.8691
cosine_recall@3 0.9939
cosine_recall@5 0.9985
cosine_recall@10 0.9999
cosine_ndcg@10 0.9936
cosine_mrr@10 0.9919
cosine_map@100 0.9909
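
Precision@3 (0.4171) is much lower than accuracy@3 (0.999) because precision divides the number of hits by k, while a query may have only one or two relevant descriptors. The sketch below computes the three @k metrics for a single query, assuming the custom evaluator follows the usual information-retrieval definitions; `retrieval_metrics_at_k` and the document ids are hypothetical.

```python
def retrieval_metrics_at_k(ranked_ids, relevant_ids, k):
    """Accuracy, precision and recall at cutoff k for one query."""
    top_k = ranked_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return {
        "accuracy": 1.0 if hits > 0 else 0.0,   # at least one relevant item in the top k
        "precision": hits / k,                  # share of the top k that is relevant
        "recall": hits / len(relevant_ids),     # share of relevant items that was retrieved
    }


# One relevant descriptor, ranked first: accuracy and recall are 1.0,
# yet precision@3 cannot exceed 1/3.
metrics = retrieval_metrics_at_k(["d1", "d7", "d3"], {"d1"}, k=3)
print(metrics["accuracy"], metrics["recall"])  # 1.0 1.0
print(metrics["precision"] == 1 / 3)           # True
```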

Training Details

Training Dataset

Unnamed Dataset

  • Size: 1,225,740 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
    • sentence_0 (string): min 37 tokens, mean 89.82 tokens, max 96 tokens
    • sentence_1 (string): min 6 tokens, mean 39.38 tokens, max 96 tokens
    • sentence_2 (string): min 5 tokens, mean 39.59 tokens, max 96 tokens
  • Samples:
    Sample 1
      sentence_0: peach fresh flesh baked with skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
      sentence_1: Cooking by dry heat in or as if in an oven
      sentence_2: Previously cooked or heat-treated fodd, heated again in order to raise its temperature (all different techniques)
    Sample 2
      sentence_0: turkey breast with bones frozen barbecued without skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
      sentence_1: Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth
      sentence_2: Drying to a water content low enough to guarantee microbiological stability, but still keeping a relatively soft structure (often used for fruit)
    Sample 3
      sentence_0: yoghurt flavoured cow blueberry sweetened with sugar sucrose whole in glass commercial supermarket shop organic shop brand product name This facet provides some principal claims related to important nutrients-ingredients, like fat, sugar etc. It is not intended to include health claims or similar. The present guidance provides a limited list, to be eventually improved during the evolution of the system. More than one descriptor can be applied to each entry, provided they are not contradicting each other.
      sentence_1: The food item has all the natural (or average expected )fat content (for milk, at least the value defined in legislation, when available). In the case of cheese, the fat on the dry matter is 45-60%
      sentence_2: The food item has an almost completely reduced amount of fat, with respect to the expected natural fat content (for milk, at least the value defined in legislation, when available). For meat, this is the entry for what is commercially intended as 'lean' meat, where fat is not visible.In the case of cheese, the fat on the dry matter is 10-25%
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
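
As a rough illustration of what these two parameters do: the loss scores each anchor against every candidate in the batch (all positives plus all explicit negatives), multiplies the cosine similarities by the scale, and applies cross-entropy with the anchor's own positive as the target. The sketch below is a plain-Python approximation under that assumption, not the library code; `mnr_loss` and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def mnr_loss(anchors, positives, negatives=(), scale=20.0):
    """Mean cross-entropy over anchors; positives[i] is the target for anchors[i]."""
    candidates = list(positives) + list(negatives)
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, c) for c in candidates]
        peak = max(logits)  # subtract the max for a numerically stable log-sum-exp
        log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_z - logits[i]
    return total / len(anchors)


anchors = [[1.0, 0.0], [0.0, 1.0]]
print(mnr_loss(anchors, positives=[[1.0, 0.0], [0.0, 1.0]]))  # close to 0: pairs match
print(mnr_loss(anchors, positives=[[0.0, 1.0], [1.0, 0.0]]))  # close to 20: pairs swapped
```

With scale 20.0, a cosine gap of 1.0 between the right and wrong candidate becomes a logit gap of 20, which is why the matched case is nearly zero and the swapped case is nearly 20.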
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • fp16: True
  • multi_dataset_batch_sampler: round_robin
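
The step counts in the training logs follow from the dataset size and the batch size. The arithmetic below assumes a single device, no gradient accumulation, and that the last partial batch is kept; the learning-rate function is a sketch of linear decay with zero warmup from the listed 5e-05, and `linear_lr` is a hypothetical helper rather than the scheduler's actual code.

```python
import math

train_samples = 1_225_740
batch_size = 48          # per_device_train_batch_size, single device assumed
epochs = 3               # num_train_epochs
base_lr = 5e-5           # learning_rate

# The final partial batch is kept, so round up.
steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs


def linear_lr(step, total=total_steps, peak=base_lr):
    """Linearly decay from peak at step 0 to zero at the last step (no warmup)."""
    return peak * max(0.0, (total - step) / total)


print(steps_per_epoch)  # 25537
print(total_steps)      # 76611
```

These values match the steps logged at epochs 1.0 and 3.0.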

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 48
  • per_device_eval_batch_size: 48
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss cosine_ndcg@10
0 0 - 0.0266
0.0196 500 1.5739 -
0.0392 1000 0.9043 -
0.0587 1500 0.8234 -
0.0783 2000 0.7861 -
0.0979 2500 0.7628 -
0.1175 3000 0.7348 -
0.1371 3500 0.7184 -
0.1566 4000 0.7167 -
0.1762 4500 0.7002 -
0.1958 5000 0.6791 0.9264
0.2154 5500 0.6533 -
0.2350 6000 0.6628 -
0.2545 6500 0.6637 -
0.2741 7000 0.639 -
0.2937 7500 0.6395 -
0.3133 8000 0.6358 -
0.3329 8500 0.617 -
0.3524 9000 0.6312 -
0.3720 9500 0.6107 -
0.3916 10000 0.6083 0.9518
0.4112 10500 0.6073 -
0.4307 11000 0.601 -
0.4503 11500 0.6047 -
0.4699 12000 0.5986 -
0.4895 12500 0.5913 -
0.5091 13000 0.5992 -
0.5286 13500 0.5911 -
0.5482 14000 0.5923 -
0.5678 14500 0.5816 -
0.5874 15000 0.582 0.9628
0.6070 15500 0.5815 -
0.6265 16000 0.5827 -
0.6461 16500 0.5885 -
0.6657 17000 0.5737 -
0.6853 17500 0.577 -
0.7049 18000 0.5687 -
0.7244 18500 0.5744 -
0.7440 19000 0.5774 -
0.7636 19500 0.5792 -
0.7832 20000 0.5645 0.9739
0.8028 20500 0.5769 -
0.8223 21000 0.5659 -
0.8419 21500 0.5635 -
0.8615 22000 0.5677 -
0.8811 22500 0.5693 -
0.9007 23000 0.5666 -
0.9202 23500 0.5526 -
0.9398 24000 0.5591 -
0.9594 24500 0.563 -
0.9790 25000 0.555 0.9808
0.9986 25500 0.5585 -
1.0 25537 - 0.9811
1.0181 26000 0.5595 -
1.0377 26500 0.5507 -
1.0573 27000 0.5582 -
1.0769 27500 0.5543 -
1.0964 28000 0.5598 -
1.1160 28500 0.5613 -
1.1356 29000 0.5457 -
1.1552 29500 0.5524 -
1.1748 30000 0.5324 0.9836
1.1943 30500 0.5531 -
1.2139 31000 0.5505 -
1.2335 31500 0.5623 -
1.2531 32000 0.5505 -
1.2727 32500 0.5583 -
1.2922 33000 0.548 -
1.3118 33500 0.5485 -
1.3314 34000 0.5509 -
1.3510 34500 0.54 -
1.3706 35000 0.5478 0.9835
1.3901 35500 0.5416 -
1.4097 36000 0.5438 -
1.4293 36500 0.543 -
1.4489 37000 0.547 -
1.4685 37500 0.5362 -
1.4880 38000 0.5536 -
1.5076 38500 0.5356 -
1.5272 39000 0.5382 -
1.5468 39500 0.5481 -
1.5664 40000 0.5302 0.9880
1.5859 40500 0.5275 -
1.6055 41000 0.5327 -
1.6251 41500 0.5414 -
1.6447 42000 0.5354 -
1.6643 42500 0.536 -
1.6838 43000 0.5364 -
1.7034 43500 0.5391 -
1.7230 44000 0.5342 -
1.7426 44500 0.5369 -
1.7621 45000 0.5387 0.9894
1.7817 45500 0.5312 -
1.8013 46000 0.5297 -
1.8209 46500 0.5222 -
1.8405 47000 0.5255 -
1.8600 47500 0.5379 -
1.8796 48000 0.5317 -
1.8992 48500 0.5312 -
1.9188 49000 0.5307 -
1.9384 49500 0.5375 -
1.9579 50000 0.527 0.9908
1.9775 50500 0.538 -
1.9971 51000 0.5312 -
2.0 51074 - 0.9911
2.0167 51500 0.5346 -
2.0363 52000 0.5279 -
2.0558 52500 0.517 -
2.0754 53000 0.5193 -
2.0950 53500 0.5286 -
2.1146 54000 0.5229 -
2.1342 54500 0.5183 -
2.1537 55000 0.5194 0.9915
2.1733 55500 0.5362 -
2.1929 56000 0.5186 -
2.2125 56500 0.5202 -
2.2321 57000 0.5276 -
2.2516 57500 0.5266 -
2.2712 58000 0.5334 -
2.2908 58500 0.5206 -
2.3104 59000 0.5229 -
2.3300 59500 0.5111 -
2.3495 60000 0.5175 0.9928
2.3691 60500 0.5235 -
2.3887 61000 0.5127 -
2.4083 61500 0.5291 -
2.4278 62000 0.5122 -
2.4474 62500 0.5196 -
2.4670 63000 0.5159 -
2.4866 63500 0.5207 -
2.5062 64000 0.5157 -
2.5257 64500 0.5094 -
2.5453 65000 0.5283 0.9937
2.5649 65500 0.5256 -
2.5845 66000 0.524 -
2.6041 66500 0.5324 -
2.6236 67000 0.5132 -
2.6432 67500 0.5203 -
2.6628 68000 0.5224 -
2.6824 68500 0.5255 -
2.7020 69000 0.5132 -
2.7215 69500 0.525 -
2.7411 70000 0.5257 0.9936
2.7607 70500 0.5206 -
2.7803 71000 0.514 -
2.7999 71500 0.5175 -
2.8194 72000 0.5245 -
2.8390 72500 0.5144 -
2.8586 73000 0.5246 -
2.8782 73500 0.5227 -
2.8978 74000 0.5199 -
2.9173 74500 0.5216 -
2.9369 75000 0.5253 0.9936
2.9565 75500 0.5303 -
2.9761 76000 0.5148 -
2.9957 76500 0.5248 -
3.0 76611 - 0.9936

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 3.4.1
  • Transformers: 4.49.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.4.0
  • Datasets: 3.3.1
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}