Thai Food Ingredients → Dish Prediction

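This is a sentence-transformers model fine-tuned from sentence-transformers/paraphrase-multilingual-mpnet-base-v2. It maps sentences and paragraphs to a 768-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more. The training pairs match Thai ingredient lists or free-text cravings (anchor) with dish names (positive), so a query should embed close to the dish it describes.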

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • Maximum Sequence Length: 128 tokens
  • Output Dimensionality: 768 dimensions

Model Sources

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: XLMRobertaModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
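
The Pooling module above has only pooling_mode_mean_tokens enabled, so the sentence vector is the average of the token vectors. The following is a minimal NumPy sketch of that idea, assuming padded positions are excluded through an attention mask; the function name mean_pool and the toy shapes are illustrative and not part of the library.

```python
import numpy as np

def mean_pool(token_embeddings: np.ndarray, attention_mask: np.ndarray) -> np.ndarray:
    """Average token vectors per sequence, ignoring padded positions.

    token_embeddings: (batch, seq_len, dim)
    attention_mask:   (batch, seq_len), 1 for real tokens and 0 for padding
    """
    mask = attention_mask[..., None].astype(token_embeddings.dtype)
    summed = (token_embeddings * mask).sum(axis=1)
    # Guard against division by zero for an all-padding row.
    counts = np.clip(mask.sum(axis=1), 1e-9, None)
    return summed / counts

# Toy input: 2 sequences, 3 token slots, 4 dimensions (the real model uses 768).
tokens = np.arange(24, dtype=np.float32).reshape(2, 3, 4)
mask = np.array([[1, 1, 0], [1, 1, 1]])
pooled = mean_pool(tokens, mask)
print(pooled.shape)
```

The first toy sequence has one padded slot, so only its first two token vectors contribute to the average.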

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("thai_food_prediction1")
# Run inference
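sentences = [
    # Ingredients: tuna, bird's eye chili, chicken egg, fish sauce, a little, vegetable oil
    'ปลาทูน่า,  พริกขี้หนู,  ไข่ไก่,  น้ำปลา, เล็กน้อย, น้ำมันพืช',
    # Dish: tuna omelet with chopped chili
    'ไข่เจียวทูน่าพริกสับ',
    # Dish: khao taen (crispy rice cracker)
    'ข้าวแต๋น',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 768)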

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
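
For dish prediction, one plausible workflow is to embed a list of candidate dish names once and rank them by cosine similarity to the embedded ingredient list. The sketch below shows only the ranking step; the helper rank_dishes is hypothetical, and small toy vectors stand in for the 768-dimensional output of model.encode so that the snippet runs without downloading the model.

```python
import numpy as np

def rank_dishes(query_embedding, dish_embeddings, dish_names, top_k=3):
    """Return the top_k (dish name, cosine similarity) pairs for one query."""
    q = query_embedding / np.linalg.norm(query_embedding)
    d = dish_embeddings / np.linalg.norm(dish_embeddings, axis=1, keepdims=True)
    scores = d @ q
    order = np.argsort(-scores)[:top_k]
    return [(dish_names[i], float(scores[i])) for i in order]

# Toy 3-dimensional vectors (assumed values, not real model output).
dish_names = ["ไข่เจียวทูน่าพริกสับ", "ข้าวแต๋น", "ปลาหมึกย่าง"]
dish_embeddings = np.array([[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.1, 0.0, 0.9]])
query_embedding = np.array([1.0, 0.2, 0.1])

ranked = rank_dishes(query_embedding, dish_embeddings, dish_names, top_k=2)
print(ranked[0][0])
```

With the real model, query_embedding would come from model.encode on the ingredient string and dish_embeddings from model.encode on the candidate dish names.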

Evaluation

Metrics

Information Retrieval

Metric               Value
cosine_accuracy@1    0.6053
cosine_accuracy@3    0.8421
cosine_accuracy@5    0.9342
cosine_accuracy@10   0.9737
cosine_precision@1   0.6053
cosine_precision@3   0.2807
cosine_precision@5   0.1868
cosine_recall@1      0.6053
cosine_recall@3      0.8421
cosine_recall@5      0.9342
cosine_ndcg@10       0.789
cosine_mrr@10        0.7292
cosine_map@100       0.7302
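
In the table, recall@k equals accuracy@k and precision@k equals accuracy@k divided by k, which is what one would expect if each query has exactly one relevant dish (an inference from the numbers, not something the card states). The sketch below computes these metrics from the 1-based rank of the correct item under that assumption; the function names and the example ranks are hypothetical.

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose correct item is ranked within the top k (1-based ranks)."""
    return sum(r <= k for r in ranks) / len(ranks)

def precision_at_k(ranks, k):
    """With one relevant item per query, at most 1 of the k retrieved items is relevant."""
    return accuracy_at_k(ranks, k) / k

def mrr_at_k(ranks, k):
    """Mean reciprocal rank, counting 0 when the correct item falls outside the top k."""
    return sum(1.0 / r if r <= k else 0.0 for r in ranks) / len(ranks)

# Hypothetical ranks of the correct dish for five queries (not from the real evaluation).
ranks = [1, 2, 1, 4, 12]
print(accuracy_at_k(ranks, 3))
print(precision_at_k(ranks, 3))
print(mrr_at_k(ranks, 10))

# Consistency check against the reported table: precision@k == accuracy@k / k.
reported_p3 = round(0.8421 / 3, 4)
reported_p5 = round(0.9342 / 5, 4)
print(reported_p3, reported_p5)
```

The last two values reproduce the reported cosine_precision@3 and cosine_precision@5 from the reported accuracy figures.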

Training Details

Training Dataset

Unnamed Dataset

  • Size: 2,452 training samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 1000 samples:
    column    type    min       mean          max
    anchor    string  7 tokens  29.15 tokens  125 tokens
    positive  string  4 tokens  9.91 tokens   22 tokens
  • Samples:
    1. anchor: ปลาหมึก, ซีอิ๊วดำ, ผงขมิ้น, น้ำปูนใส, กระเทียมสับ, รากผักชี, พริกแดง, น้ำตาลปี๊บ, เกลือ, น้ำปลา, น้ำมะนาว
       positive: ปลาหมึกย่าง
    2. anchor: ไปตกหมึกมา อยากทำอะไรกินง่ายๆ ได้รสชาติของปลาหมึกแท้ๆ
       positive: ปลาหมึกย่าง
    3. anchor: อยากกินปลาหมึกๆ ซีฟุ้ด อร่อยๆ
       positive: ปลาหมึกย่าง
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
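
MultipleNegativesRankingLoss treats, for each anchor, its own positive as the target and every other positive in the batch as a negative, applying a softmax over cosine similarities multiplied by the scale (20.0 here). The following NumPy sketch illustrates that computation; it is a simplified stand-in, not the library's PyTorch implementation, and mnr_loss is a hypothetical name.

```python
import numpy as np

def mnr_loss(anchors: np.ndarray, positives: np.ndarray, scale: float = 20.0) -> float:
    """In-batch softmax loss: positive i is the target for anchor i."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = scale * (a @ p.T)                     # (batch, batch) scaled cosine similarities
    logits -= logits.max(axis=1, keepdims=True)    # subtract row max for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))     # cross-entropy with the diagonal as labels

# Toy embeddings: each anchor matches its own positive exactly...
aligned = np.eye(3)
loss_good = mnr_loss(aligned, aligned)
# ...versus every anchor matching some other pair's positive instead.
loss_bad = mnr_loss(aligned, np.roll(aligned, 1, axis=0))
print(loss_good, loss_bad)
```

With scale 20.0, a batch of perfectly matched pairs gives a loss near zero, while fully mismatched pairs give a loss near 20.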
    

Evaluation Dataset

Unnamed Dataset

  • Size: 76 evaluation samples
  • Columns: anchor and positive
  • Approximate statistics based on the first 76 samples:
    column    type    min       mean          max
    anchor    string  4 tokens  47.42 tokens  86 tokens
    positive  string  4 tokens  10.24 tokens  20 tokens
  • Samples:
    1. anchor: น้ำมันพืช, กระเทียม, น้ำตาลทราย, น้ำปลา, ซีอิ๊วขาว, ซอสปรุงรส, ซีอิ๊วดำเค็ม, น้ำส้มสายชู, พริกไทย, เส้นหมี่แห้ง, ลูกชิ้น, ถั่วงอก
       positive: หมี่คลุก
    2. anchor: น้ำมัน, กระเทียม, หมูหมัก, เส้นใหญ่, ซีอิ้วดำ, คะน้า, กระหล่ำปลี, แครอท, ไข่เป็ด, ไข่ไก่, ผงปรุงรส, น้ำตาลทราย, ซอสหอยนางรม, ซอสปรุงรส, พริกไทย
       positive: ผัดซีอิ้วเส้นใหญ่
    3. anchor: สะโพกหมู, น้ำตาลทราย, น้ำตาลปี๊บ, ซีอิ๊วขาว, เกลือ, น้ำเปล่า, ลูกผักชี, ยี่หร่า, กระเทียมไทย, สับละเอียด, น้ำมันพืช
       positive: หมูสวรรค์
  • Loss: MultipleNegativesRankingLoss with these parameters:
    {
        "scale": 20.0,
        "similarity_fct": "cos_sim"
    }
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: epoch
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • learning_rate: 5e-06
  • num_train_epochs: 6
  • warmup_ratio: 0.1
  • load_best_model_at_end: True
  • batch_sampler: no_duplicates
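
To reproduce a similar run, the non-default values above could be collected as keyword arguments. The sketch below builds such a dictionary in plain Python. The output directory and save_strategy are assumptions that the card does not list; save_strategy is set to match eval_strategy because load_best_model_at_end requires the two to agree.

```python
# Non-default values copied from the list above.
non_default_args = {
    "eval_strategy": "epoch",
    "per_device_train_batch_size": 24,
    "per_device_eval_batch_size": 24,
    "learning_rate": 5e-6,
    "num_train_epochs": 6,
    "warmup_ratio": 0.1,
    "load_best_model_at_end": True,
    "batch_sampler": "no_duplicates",
}

# Assumptions, not listed in the card.
assumed_args = {
    "output_dir": "output/thai-food-model",  # hypothetical path
    "save_strategy": "epoch",                # assumed, to match eval_strategy
}

training_kwargs = {**assumed_args, **non_default_args}
for name, value in sorted(training_kwargs.items()):
    print(f"{name}: {value}")

# With sentence-transformers installed, these could be passed as:
#   from sentence_transformers import SentenceTransformerTrainingArguments
#   args = SentenceTransformerTrainingArguments(**training_kwargs)
```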

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: epoch
  • prediction_loss_only: True
  • per_device_train_batch_size: 24
  • per_device_eval_batch_size: 24
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-06
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 6
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: True
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional
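
The no_duplicates batch sampler matters for this dataset because several anchors share one positive (the training samples show three anchors for ปลาหมึกย่าง), and with in-batch negatives a repeated dish would be scored as a negative for its own anchor. The following is a simplified greedy sketch of the idea, not the library's sampler; the function name and the anchor labels are hypothetical.

```python
def no_duplicate_batches(pairs, batch_size):
    """Greedily group (anchor, positive) pairs so that no text repeats within a batch."""
    remaining = list(pairs)
    batches = []
    while remaining:
        batch, seen, deferred = [], set(), []
        for anchor, positive in remaining:
            if len(batch) < batch_size and anchor not in seen and positive not in seen:
                batch.append((anchor, positive))
                seen.update((anchor, positive))
            else:
                # Batch is full or a text is already present: try again in a later batch.
                deferred.append((anchor, positive))
        batches.append(batch)
        remaining = deferred
    return batches

# Placeholder anchors paired with dish names that appear in the card's samples.
pairs = [
    ("anchor 1", "ปลาหมึกย่าง"),
    ("anchor 2", "ปลาหมึกย่าง"),
    ("anchor 3", "หมี่คลุก"),
    ("anchor 4", "หมูสวรรค์"),
]
batches = no_duplicate_batches(pairs, batch_size=3)
for batch in batches:
    print([positive for _, positive in batch])
```

The second pair for ปลาหมึกย่าง is pushed into a later batch, so each batch contains a dish at most once.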

Training Logs

Epoch Step Training Loss Validation Loss thai-food-eval_cosine_ndcg@10
0.0971 10 3.2532 - -
0.1942 20 2.6975 - -
0.2913 30 2.3365 - -
0.3883 40 1.9787 - -
0.4854 50 1.9125 - -
0.5825 60 1.7024 - -
0.6796 70 1.6074 - -
0.7767 80 1.3358 - -
0.8738 90 1.4281 - -
0.9709 100 1.4312 - -
1.0 103 - 0.9767 0.6681
1.0680 110 1.1873 - -
1.1650 120 1.1148 - -
1.2621 130 1.1163 - -
1.3592 140 1.0429 - -
1.4563 150 0.9594 - -
1.5534 160 0.9593 - -
1.6505 170 1.0314 - -
1.7476 180 1.0236 - -
1.8447 190 1.0052 - -
1.9417 200 1.0062 - -
2.0 206 - 0.6975 0.7435
2.0388 210 0.8259 - -
2.1359 220 0.6713 - -
2.2330 230 0.7833 - -
2.3301 240 0.8613 - -
2.4272 250 0.6706 - -
2.5243 260 0.8971 - -
2.6214 270 0.7678 - -
2.7184 280 0.741 - -
2.8155 290 0.6872 - -
2.9126 300 0.7854 - -
3.0 309 - 0.6185 0.7481
3.0097 310 0.7095 - -
3.1068 320 0.6708 - -
3.2039 330 0.6311 - -
3.3010 340 0.6769 - -
3.3981 350 0.5816 - -
3.4951 360 0.6604 - -
3.5922 370 0.6356 - -
3.6893 380 0.5459 - -
3.7864 390 0.5856 - -
3.8835 400 0.6812 - -
3.9806 410 0.5893 - -
4.0 412 - 0.5796 0.7742
4.0777 420 0.4721 - -
4.1748 430 0.4353 - -
4.2718 440 0.5372 - -
4.3689 450 0.6343 - -
4.4660 460 0.6572 - -
4.5631 470 0.601 - -
4.6602 480 0.5418 - -
4.7573 490 0.5312 - -
4.8544 500 0.5055 - -
4.9515 510 0.5447 - -
5.0 515 - 0.5373 0.7877
5.0485 520 0.5501 - -
5.1456 530 0.5831 - -
5.2427 540 0.5378 - -
5.3398 550 0.4975 - -
5.4369 560 0.5326 - -
5.5340 570 0.3991 - -
5.6311 580 0.473 - -
5.7282 590 0.4915 - -
5.8252 600 0.4234 - -
5.9223 610 0.5445 - -
6.0 618 - 0.5209 0.789
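  • The saved checkpoint appears to be the final row (epoch 6.0, step 618): its cosine_ndcg@10 of 0.789 matches the value reported under Metrics, and its validation loss is the lowest logged.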
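
The step counts in the log follow from the dataset size and batch size, and the learning rate at each step follows from the linear scheduler with warmup_ratio 0.1. The sketch below reproduces that arithmetic in plain Python, assuming a single device, gradient_accumulation_steps of 1, no dropped final batch, and warmup steps rounded up; linear_lr is a hypothetical helper, not a library function.

```python
import math

train_samples = 2452
batch_size = 24
epochs = 6
peak_lr = 5e-6
warmup_ratio = 0.1

steps_per_epoch = math.ceil(train_samples / batch_size)
total_steps = steps_per_epoch * epochs
warmup_steps = math.ceil(total_steps * warmup_ratio)  # rounding up is an assumption

def linear_lr(step: int) -> float:
    """Linear warmup to peak_lr, then linear decay to zero at total_steps."""
    if step < warmup_steps:
        return peak_lr * step / warmup_steps
    return peak_lr * max(0.0, (total_steps - step) / (total_steps - warmup_steps))

print(steps_per_epoch, total_steps, warmup_steps)
print(round(10 / steps_per_epoch, 4))  # epoch fraction logged at step 10
print(linear_lr(warmup_steps), linear_lr(total_steps))
```

The computed 103 steps per epoch and 618 total steps agree with the epoch boundaries in the log, and 10 / 103 gives the 0.0971 shown for step 10.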

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.53.0
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.8.1
  • Datasets: 2.14.4
  • Tokenizers: 0.21.2

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

MultipleNegativesRankingLoss

@misc{henderson2017efficient,
    title={Efficient Natural Language Response Suggestion for Smart Reply},
    author={Matthew Henderson and Rami Al-Rfou and Brian Strope and Yun-hsuan Sung and Laszlo Lukacs and Ruiqi Guo and Sanjiv Kumar and Balint Miklos and Ray Kurzweil},
    year={2017},
    eprint={1705.00652},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
}

Model size: 278M params (Safetensors, tensor type F32)

Model tree for Chanisorn/thai-food-mpnet-new-v10
