metadata
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:1225740
- loss:MultipleNegativesRankingLoss
base_model: BAAI/bge-m3
widget:
- source_sentence: >-
yoghurt cow chocolate chip sugar reduced half skimmed in plastic container
commercial supermarket shop organic shop </s> This facet allows recording
the place where the food was prepared for consumption. Only one descriptor
from this facet can be added to each entry.
sentences:
- >-
Product obtained during the processing of screened, dehusked barley into
pearl barley, semolina or flour. It consists principally of particles of
endosperm with fine fragments of the outer skins and some grain
screenings.
- Produced by industry in the form it arrives to the final consumer
- >-
Tree nuts from the plant classified under the species Corylus avellana
L., commonly known as Hazelnuts or Cobnuts or Common hazelnut. The part
consumed/analysed is not specified. When relevant, information on the
part consumed/analysed has to be reported with additional facet
descriptors. In case of data collections related to legislations, the
default part consumed/analysed is the one defined in the applicable
legislation.
- source_sentence: >-
sauce cold liquid preservation method onion mint croutons sweet pepper
prepared at a restaurant </s> This facet collects ingredients and/or
flavour note. Regarding ingredients this facet serves the purpose of
providing information on ingredients of a composite food being important
from some point of view, like allergic reactions, hazards, but also
aspect, taste. The descriptors for this facet are taken from a selected
subset of the main list (actually a relevant part of the food list). More
(none contradicting) descriptors can be applied to each entry.
sentences:
- >-
Spices from the fruits of the plant classified under the species Piper
cubeba L. f., commonly known as Cubeb fruit or Tailed pepper. The part
consumed/analysed is not specified. When relevant, information on the
part consumed/analysed has to be reported with additional facet
descriptors. In case of data collections related to legislations, the
default part consumed/analysed is the one defined in the applicable
legislation.
- >-
Tree nuts from the plant classified under the genus Juglans L. spp.,
commonly known as Walnuts or Walnut Black or Walnut English or Walnut
Persian. The part consumed/analysed is not specified. When relevant,
information on the part consumed/analysed has to be reported with
additional facet descriptors. In case of data collections related to
legislations, the default part consumed/analysed is the one defined in
the applicable legislation.
- >-
Fruiting vegetables from the plant classified under the species Capsicum
annuum var. grossum (L.) Sendtner or Capsicum annuum var. longum Bailey,
commonly known as Sweet peppers or Bell peppers or Paprika or
PeppersLong or Pimento or Pimiento. The part consumed/analysed is not
specified. When relevant, information on the part consumed/analysed has
to be reported with additional facet descriptors. In case of data
collections related to legislations, the default part consumed/analysed
is the one defined in the applicable legislation.
- source_sentence: >-
yoghurt with fruits cow passion fruit sweetened with sugar sucrose fat
content in plastic container commercial supermarket shop organic shop </s>
This facet provides some principal claims related to important
nutrients-ingredients, like fat, sugar etc. It is not intended to include
health claims or similar. The present guidance provides a limited list, to
be eventually improved during the evolution of the system. More than one
descriptor can be applied to each entry, provided they are not
contradicting each other.
sentences:
- >-
Product where all or part of the sugar has been added during processing
and is not naturally contained
- >-
Infusion materials from flowers of the plant classified under the genus
Rosa L. spp., commonly known as Rose infusion flowers. The part
consumed/analysed is not specified. When relevant, information on the
part consumed/analysed has to be reported with additional facet
descriptors. In case of data collections related to legislations, the
default part consumed/analysed is the one defined in the applicable
legislation.
- >-
Molecules providing intensive sweet sensation, used to substitute
natural sugars in food formulas
- source_sentence: >-
pepper sweet green facets desc physical state form as quantified grated
cooking method stir fried sauted preservation method fresh </s> This facet
describes the form (physical aspect) of the food as reported by the
consumer (as estimated during interview or as registered in the diary)
(Consumption Data) or as expressed in the analysis results in the
laboratory (Occurrence Data). Only one descriptor from this facet can be
added to each entry, apart from the specification “with solid particles”.
This facet should only be used in case of raw foods and ingredients (not
for composite foods).
sentences:
- Unprocessed and not stored over any long period
- >-
Paste coarsely divided, where particles are still recognisable at naked
eye
- The food item is considered in its form with skin
- source_sentence: >-
tome des bauges raw milk aoc in plastic container brand product name </s>
This facet allows recording whether the food list code was chosen because
of lack of information on the food item or because the proper entry in the
food list was missing. Only one descriptor from this facet can be added to
each entry.
sentences:
- >-
The food list item has been chosen because none of the more detailed
items corresponded to the available information. Please consider the
eventual addition of a new term in the list
- >-
The food item has a fat content which, when rounded with the standard
rules of rounding, equals 25 % (weight/weight)
- >-
Deprecated term that must NOT be used for any purpose. Its original
scopenote was: The group includes any type of Other fruiting vegetables
(exposure). The part consumed/analysed is by default unspecified. When
relevant, information on the part consumed/analysed has to be reported
with additional facet descriptors.
pipeline_tag: sentence-similarity
library_name: sentence-transformers
metrics:
- cosine_accuracy@1
- cosine_accuracy@3
- cosine_accuracy@5
- cosine_accuracy@10
- cosine_precision@1
- cosine_precision@3
- cosine_precision@5
- cosine_precision@10
- cosine_recall@1
- cosine_recall@3
- cosine_recall@5
- cosine_recall@10
- cosine_ndcg@10
- cosine_mrr@10
- cosine_map@100
model-index:
- name: SentenceTransformer based on BAAI/bge-m3
results:
- task:
type: device-aware-information-retrieval
name: Device Aware Information Retrieval
dataset:
name: Unknown
type: unknown
metrics:
- type: cosine_accuracy@1
value: 0.9849655460430152
name: Cosine Accuracy@1
- type: cosine_accuracy@3
value: 0.9989559406974317
name: Cosine Accuracy@3
- type: cosine_accuracy@5
value: 0.9997911881394863
name: Cosine Accuracy@5
- type: cosine_accuracy@10
value: 1
name: Cosine Accuracy@10
- type: cosine_precision@1
value: 0.9849655460430152
name: Cosine Precision@1
- type: cosine_precision@3
value: 0.41713649335282244
name: Cosine Precision@3
- type: cosine_precision@5
value: 0.25370641052411774
name: Cosine Precision@5
- type: cosine_precision@10
value: 0.12752140321570266
name: Cosine Precision@10
- type: cosine_recall@1
value: 0.8690666019440294
name: Cosine Recall@1
- type: cosine_recall@3
value: 0.993924343214383
name: Cosine Recall@3
- type: cosine_recall@5
value: 0.998536283094646
name: Cosine Recall@5
- type: cosine_recall@10
value: 0.9999462151268373
name: Cosine Recall@10
- type: cosine_ndcg@10
value: 0.9936056206465634
name: Cosine Ndcg@10
- type: cosine_mrr@10
value: 0.9919155008004455
name: Cosine Mrr@10
- type: cosine_map@100
value: 0.9909164791232326
name: Cosine Map@100
SentenceTransformer based on BAAI/bge-m3
This is a sentence-transformers model finetuned from BAAI/bge-m3. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: BAAI/bge-m3
- Maximum Sequence Length: 96 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 96, 'do_lower_case': False}) with Transformer model: XLMRobertaModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
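The values reported under Model Description (96-token truncation, 1024-dimensional embeddings, mean pooling over an XLM-RoBERTa backbone) can be read back from the loaded model. A minimal sketch, assuming the repository id from the Usage section below:
from sentence_transformers import SentenceTransformer

# Load the published checkpoint and inspect the modules listed above
model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

print(model)                                     # Transformer (XLMRobertaModel) followed by mean Pooling
print(model.max_seq_length)                      # 96
print(model.get_sentence_embedding_dimension())  # 1024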
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")
# Run inference
sentences = [
'tome des bauges raw milk aoc in plastic container brand product name </s> This facet allows recording whether the food list code was chosen because of lack of information on the food item or because the proper entry in the food list was missing. Only one descriptor from this facet can be added to each entry.',
'The food list item has been chosen because none of the more detailed items corresponded to the available information. Please consider the eventual addition of a new term in the list',
'Deprecated term that must NOT be used for any purpose. Its original scopenote was: The group includes any type of Other fruiting vegetables (exposure). The part consumed/analysed is by default unspecified. When relevant, information on the part consumed/analysed has to be reported with additional facet descriptors.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
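Since the model is trained as a retriever over FoodEx2 facet descriptors, a more typical use is to rank candidate descriptors against a query built from the food description and the facet scope note (separated by </s>, as in the sentences above). A minimal sketch with a hand-picked candidate list; the descriptor strings are taken from the training samples shown further down, and in practice the corpus would contain all descriptors of the relevant facet:
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Query: food description + facet scope note (abridged here for readability)
query = "peach fresh flesh baked with skin </s> This facet allows recording different characteristics of the food ..."

# Candidate facet descriptors (a tiny illustrative subset)
candidates = [
    "Cooking by dry heat in or as if in an oven",
    "Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth",
    "Unprocessed and not stored over any long period",
]

query_emb = model.encode(query, convert_to_tensor=True)
cand_embs = model.encode(candidates, convert_to_tensor=True)

# Rank candidates by cosine similarity and print the best matches
hits = util.semantic_search(query_emb, cand_embs, top_k=3)[0]
for hit in hits:
    print(f"{hit['score']:.3f}  {candidates[hit['corpus_id']]}")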
Evaluation
Metrics
Device Aware Information Retrieval
- Evaluated with src.utils.eval_functions.DeviceAwareInformationRetrievalEvaluator
Metric | Value |
---|---|
cosine_accuracy@1 | 0.985 |
cosine_accuracy@3 | 0.999 |
cosine_accuracy@5 | 0.9998 |
cosine_accuracy@10 | 1.0 |
cosine_precision@1 | 0.985 |
cosine_precision@3 | 0.4171 |
cosine_precision@5 | 0.2537 |
cosine_precision@10 | 0.1275 |
cosine_recall@1 | 0.8691 |
cosine_recall@3 | 0.9939 |
cosine_recall@5 | 0.9985 |
cosine_recall@10 | 0.9999 |
cosine_ndcg@10 | 0.9936 |
cosine_mrr@10 | 0.9919 |
cosine_map@100 | 0.9909 |
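DeviceAwareInformationRetrievalEvaluator is a project-specific class from the training code (src.utils.eval_functions) and is not bundled with this card. The same family of metrics can be reproduced with the library's standard InformationRetrievalEvaluator; a sketch with toy, made-up ids rather than the actual evaluation split:
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import InformationRetrievalEvaluator

model = SentenceTransformer("disi-unibo-nlp/foodex-facet-descriptors-retriever")

# Toy data: ids and texts are illustrative only
queries = {"q1": "peach fresh flesh baked with skin </s> This facet allows recording different characteristics of the food ..."}
corpus = {
    "d1": "Cooking by dry heat in or as if in an oven",
    "d2": "Unprocessed and not stored over any long period",
}
relevant_docs = {"q1": {"d1"}}  # which descriptors are correct for each query

evaluator = InformationRetrievalEvaluator(
    queries=queries,
    corpus=corpus,
    relevant_docs=relevant_docs,
    accuracy_at_k=[1, 3, 5, 10],
    precision_recall_at_k=[1, 3, 5, 10],
    mrr_at_k=[10],
    ndcg_at_k=[10],
    map_at_k=[100],
)
results = evaluator(model)
print(results)  # dict of metrics such as cosine_accuracy@k, cosine_ndcg@10, ...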
Training Details
Training Dataset
Unnamed Dataset
- Size: 1,225,740 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:
 | sentence_0 | sentence_1 | sentence_2 |
---|---|---|---|
type | string | string | string |
details | min: 37 tokens, mean: 89.82 tokens, max: 96 tokens | min: 6 tokens, mean: 39.38 tokens, max: 96 tokens | min: 5 tokens, mean: 39.59 tokens, max: 96 tokens |
- Samples:
Sample 1
- sentence_0: peach fresh flesh baked with skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
- sentence_1: Cooking by dry heat in or as if in an oven
- sentence_2: Previously cooked or heat-treated food, heated again in order to raise its temperature (all different techniques)
Sample 2
- sentence_0: turkey breast with bones frozen barbecued without skin This facet allows recording different characteristics of the food: preservation treatments a food item underwent, technological steps or treatments applied while producing a food item, the way a food item has been heat treated before consumption and the way a food item has been prepared for final consumption (particularly needed for consumption surveys and includes preparation (like battering or breading) as well as heat treatment steps). More (none contradicting) descriptors can be applied to each entry.
- sentence_1: Preserving by freezing sufficiently rapidly to avoid spoilage and microbial growth
- sentence_2: Drying to a water content low enough to guarantee microbiological stability, but still keeping a relatively soft structure (often used for fruit)
Sample 3
- sentence_0: yoghurt flavoured cow blueberry sweetened with sugar sucrose whole in glass commercial supermarket shop organic shop brand product name This facet provides some principal claims related to important nutrients-ingredients, like fat, sugar etc. It is not intended to include health claims or similar. The present guidance provides a limited list, to be eventually improved during the evolution of the system. More than one descriptor can be applied to each entry, provided they are not contradicting each other.
- sentence_1: The food item has all the natural (or average expected) fat content (for milk, at least the value defined in legislation, when available). In the case of cheese, the fat on the dry matter is 45-60%
- sentence_2: The food item has an almost completely reduced amount of fat, with respect to the expected natural fat content (for milk, at least the value defined in legislation, when available). For meat, this is the entry for what is commercially intended as 'lean' meat, where fat is not visible. In the case of cheese, the fat on the dry matter is 10-25%
- Loss: MultipleNegativesRankingLoss with these parameters: { "scale": 20.0, "similarity_fct": "cos_sim" }
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 48
- per_device_eval_batch_size: 48
- fp16: True
- multi_dataset_batch_sampler: round_robin
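For reference, a minimal training sketch combining the loss and the non-default hyperparameters above. The (sentence_0, sentence_1, sentence_2) triplets shown under Training Dataset stand in for the full 1,225,740-sample corpus, which is not distributed with this card, and the evaluation set used with eval_strategy: steps is omitted:
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
)
from sentence_transformers.losses import MultipleNegativesRankingLoss

model = SentenceTransformer("BAAI/bge-m3")
model.max_seq_length = 96  # matches the truncation length reported above

# Placeholder triplets: anchor (query), positive descriptor, hard negative descriptor
train_dataset = Dataset.from_dict({
    "sentence_0": ["peach fresh flesh baked with skin </s> This facet allows recording different characteristics of the food ..."],
    "sentence_1": ["Cooking by dry heat in or as if in an oven"],
    "sentence_2": ["Previously cooked or heat-treated food, heated again in order to raise its temperature (all different techniques)"],
})

# In-batch negatives ranking loss; defaults are scale=20.0 and cosine similarity
loss = MultipleNegativesRankingLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="foodex-facet-descriptors-retriever",
    num_train_epochs=3,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    fp16=True,
    multi_dataset_batch_sampler="round_robin",
    # eval_strategy="steps" was used together with an evaluation set, omitted here
)

trainer = SentenceTransformerTrainer(model=model, args=args, train_dataset=train_dataset, loss=loss)
trainer.train()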
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 48
- per_device_eval_batch_size: 48
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1.0
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: True
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters: 
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- dispatch_batches: None
- split_batches: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | cosine_ndcg@10 |
---|---|---|---|
0 | 0 | - | 0.0266 |
0.0196 | 500 | 1.5739 | - |
0.0392 | 1000 | 0.9043 | - |
0.0587 | 1500 | 0.8234 | - |
0.0783 | 2000 | 0.7861 | - |
0.0979 | 2500 | 0.7628 | - |
0.1175 | 3000 | 0.7348 | - |
0.1371 | 3500 | 0.7184 | - |
0.1566 | 4000 | 0.7167 | - |
0.1762 | 4500 | 0.7002 | - |
0.1958 | 5000 | 0.6791 | 0.9264 |
0.2154 | 5500 | 0.6533 | - |
0.2350 | 6000 | 0.6628 | - |
0.2545 | 6500 | 0.6637 | - |
0.2741 | 7000 | 0.639 | - |
0.2937 | 7500 | 0.6395 | - |
0.3133 | 8000 | 0.6358 | - |
0.3329 | 8500 | 0.617 | - |
0.3524 | 9000 | 0.6312 | - |
0.3720 | 9500 | 0.6107 | - |
0.3916 | 10000 | 0.6083 | 0.9518 |
0.4112 | 10500 | 0.6073 | - |
0.4307 | 11000 | 0.601 | - |
0.4503 | 11500 | 0.6047 | - |
0.4699 | 12000 | 0.5986 | - |
0.4895 | 12500 | 0.5913 | - |
0.5091 | 13000 | 0.5992 | - |
0.5286 | 13500 | 0.5911 | - |
0.5482 | 14000 | 0.5923 | - |
0.5678 | 14500 | 0.5816 | - |
0.5874 | 15000 | 0.582 | 0.9628 |
0.6070 | 15500 | 0.5815 | - |
0.6265 | 16000 | 0.5827 | - |
0.6461 | 16500 | 0.5885 | - |
0.6657 | 17000 | 0.5737 | - |
0.6853 | 17500 | 0.577 | - |
0.7049 | 18000 | 0.5687 | - |
0.7244 | 18500 | 0.5744 | - |
0.7440 | 19000 | 0.5774 | - |
0.7636 | 19500 | 0.5792 | - |
0.7832 | 20000 | 0.5645 | 0.9739 |
0.8028 | 20500 | 0.5769 | - |
0.8223 | 21000 | 0.5659 | - |
0.8419 | 21500 | 0.5635 | - |
0.8615 | 22000 | 0.5677 | - |
0.8811 | 22500 | 0.5693 | - |
0.9007 | 23000 | 0.5666 | - |
0.9202 | 23500 | 0.5526 | - |
0.9398 | 24000 | 0.5591 | - |
0.9594 | 24500 | 0.563 | - |
0.9790 | 25000 | 0.555 | 0.9808 |
0.9986 | 25500 | 0.5585 | - |
1.0 | 25537 | - | 0.9811 |
1.0181 | 26000 | 0.5595 | - |
1.0377 | 26500 | 0.5507 | - |
1.0573 | 27000 | 0.5582 | - |
1.0769 | 27500 | 0.5543 | - |
1.0964 | 28000 | 0.5598 | - |
1.1160 | 28500 | 0.5613 | - |
1.1356 | 29000 | 0.5457 | - |
1.1552 | 29500 | 0.5524 | - |
1.1748 | 30000 | 0.5324 | 0.9836 |
1.1943 | 30500 | 0.5531 | - |
1.2139 | 31000 | 0.5505 | - |
1.2335 | 31500 | 0.5623 | - |
1.2531 | 32000 | 0.5505 | - |
1.2727 | 32500 | 0.5583 | - |
1.2922 | 33000 | 0.548 | - |
1.3118 | 33500 | 0.5485 | - |
1.3314 | 34000 | 0.5509 | - |
1.3510 | 34500 | 0.54 | - |
1.3706 | 35000 | 0.5478 | 0.9835 |
1.3901 | 35500 | 0.5416 | - |
1.4097 | 36000 | 0.5438 | - |
1.4293 | 36500 | 0.543 | - |
1.4489 | 37000 | 0.547 | - |
1.4685 | 37500 | 0.5362 | - |
1.4880 | 38000 | 0.5536 | - |
1.5076 | 38500 | 0.5356 | - |
1.5272 | 39000 | 0.5382 | - |
1.5468 | 39500 | 0.5481 | - |
1.5664 | 40000 | 0.5302 | 0.9880 |
1.5859 | 40500 | 0.5275 | - |
1.6055 | 41000 | 0.5327 | - |
1.6251 | 41500 | 0.5414 | - |
1.6447 | 42000 | 0.5354 | - |
1.6643 | 42500 | 0.536 | - |
1.6838 | 43000 | 0.5364 | - |
1.7034 | 43500 | 0.5391 | - |
1.7230 | 44000 | 0.5342 | - |
1.7426 | 44500 | 0.5369 | - |
1.7621 | 45000 | 0.5387 | 0.9894 |
1.7817 | 45500 | 0.5312 | - |
1.8013 | 46000 | 0.5297 | - |
1.8209 | 46500 | 0.5222 | - |
1.8405 | 47000 | 0.5255 | - |
1.8600 | 47500 | 0.5379 | - |
1.8796 | 48000 | 0.5317 | - |
1.8992 | 48500 | 0.5312 | - |
1.9188 | 49000 | 0.5307 | - |
1.9384 | 49500 | 0.5375 | - |
1.9579 | 50000 | 0.527 | 0.9908 |
1.9775 | 50500 | 0.538 | - |
1.9971 | 51000 | 0.5312 | - |
2.0 | 51074 | - | 0.9911 |
2.0167 | 51500 | 0.5346 | - |
2.0363 | 52000 | 0.5279 | - |
2.0558 | 52500 | 0.517 | - |
2.0754 | 53000 | 0.5193 | - |
2.0950 | 53500 | 0.5286 | - |
2.1146 | 54000 | 0.5229 | - |
2.1342 | 54500 | 0.5183 | - |
2.1537 | 55000 | 0.5194 | 0.9915 |
2.1733 | 55500 | 0.5362 | - |
2.1929 | 56000 | 0.5186 | - |
2.2125 | 56500 | 0.5202 | - |
2.2321 | 57000 | 0.5276 | - |
2.2516 | 57500 | 0.5266 | - |
2.2712 | 58000 | 0.5334 | - |
2.2908 | 58500 | 0.5206 | - |
2.3104 | 59000 | 0.5229 | - |
2.3300 | 59500 | 0.5111 | - |
2.3495 | 60000 | 0.5175 | 0.9928 |
2.3691 | 60500 | 0.5235 | - |
2.3887 | 61000 | 0.5127 | - |
2.4083 | 61500 | 0.5291 | - |
2.4278 | 62000 | 0.5122 | - |
2.4474 | 62500 | 0.5196 | - |
2.4670 | 63000 | 0.5159 | - |
2.4866 | 63500 | 0.5207 | - |
2.5062 | 64000 | 0.5157 | - |
2.5257 | 64500 | 0.5094 | - |
2.5453 | 65000 | 0.5283 | 0.9937 |
2.5649 | 65500 | 0.5256 | - |
2.5845 | 66000 | 0.524 | - |
2.6041 | 66500 | 0.5324 | - |
2.6236 | 67000 | 0.5132 | - |
2.6432 | 67500 | 0.5203 | - |
2.6628 | 68000 | 0.5224 | - |
2.6824 | 68500 | 0.5255 | - |
2.7020 | 69000 | 0.5132 | - |
2.7215 | 69500 | 0.525 | - |
2.7411 | 70000 | 0.5257 | 0.9936 |
2.7607 | 70500 | 0.5206 | - |
2.7803 | 71000 | 0.514 | - |
2.7999 | 71500 | 0.5175 | - |
2.8194 | 72000 | 0.5245 | - |
2.8390 | 72500 | 0.5144 | - |
2.8586 | 73000 | 0.5246 | - |
2.8782 | 73500 | 0.5227 | - |
2.8978 | 74000 | 0.5199 | - |
2.9173 | 74500 | 0.5216 | - |
2.9369 | 75000 | 0.5253 | 0.9936 |
2.9565 | 75500 | 0.5303 | - |
2.9761 | 76000 | 0.5148 | - |
2.9957 | 76500 | 0.5248 | - |
3.0 | 76611 | - | 0.9936 |
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 3.4.1
- Transformers: 4.49.0
- PyTorch: 2.6.0+cu124
- Accelerate: 1.4.0
- Datasets: 3.3.1
- Tokenizers: 0.21.0
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
MultipleNegativesRankingLoss
@misc{henderson2017efficient,
title={Efficient Natural Language Response Suggestion for Smart Reply},
author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
year={2017},
eprint={1705.00652},
archivePrefix={arXiv},
primaryClass={cs.CL}
}