Top-performing model on TalentCLEF 2025 Task B. Use it for job title <-> skill set matching.
SentenceTransformer(
(0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
)
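The module list above ends with CLS-token pooling (`pooling_mode_cls_token: True`) followed by `Normalize()`. A minimal pure-Python sketch of those two steps, using toy 4-dimensional vectors instead of the model's 1024 dimensions; the helper names `cls_pool` and `l2_normalize` are illustrative and not part of the library:

```python
import math

def cls_pool(token_embeddings):
    """Return the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])

def l2_normalize(vector):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return list(vector) if norm == 0.0 else [x / norm for x in vector]

# Toy token embeddings: 3 tokens, 4 dimensions (hypothetical values).
tokens = [
    [3.0, 4.0, 0.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.0, 2.0, 0.0, 2.0],
]
embedding = l2_normalize(cls_pool(tokens))
print(embedding)
```

Because the output has unit length, the dot product of two such embeddings equals their cosine similarity.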
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
# Download from the 🤗 Hub
model = SentenceTransformer("pj-mathematician/JobSkillBGE-large-en-v1.5-v2")
# Run inference
sentences = [
"An insulation supervisor, regardless of the specific type of insulation material or installation area, requires strong project management skills, knowledge of building codes and safety regulations, and expertise in insulation techniques to oversee the installation process effectively and ensure quality standards are met.\n['insulation supervisor', 'supervisor of installation of insulating materials', 'supervisor of insulation materials installation', 'supervisor of installation of insulation', 'solid wall insulation installation supervisor', 'insulation installers supervisor', 'cavity wall insulation installation supervisor', 'loft insulation installation supervisor']",
"The skill of installing insulation material is primarily required by job roles such as insulation workers, HVAC technicians, and construction specialists, who are responsible for improving energy efficiency and thermal comfort in buildings by correctly fitting and fixing insulation materials in various structures.\n['install insulation material', 'insulate structure', 'fix insulation', 'insulation material installation', 'installation of insulation material', 'fitting insulation', 'insulating structure', 'installing insulation material', 'fixing insulation', 'fit insulation']",
"Job roles such as Food Safety Inspector, Public Health Officer, and Environmental Health Specialist require the skill of taking action on food safety violations to ensure compliance with health regulations and maintain public safety standards.\n['take action on food safety violations', 'invoke action on food safety violations', 'agree action on food safety violations', 'pursue action on food safety violations', 'determine action on food safety violations']",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
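For job title <-> skill set matching, the similarity scores are typically turned into a ranking of candidates per query. The following is a minimal sketch of that step in pure Python, assuming cosine similarity as the scoring function; the toy vectors and the helper names `cosine` and `rank_candidates` are hypothetical stand-ins for real `model.encode` outputs:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_candidates(query_vec, candidate_vecs, top_k=None):
    """Return (candidate_index, score) pairs sorted by descending similarity."""
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(candidate_vecs)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(idx, score) for score, idx in scored[:top_k]]

# Toy embeddings (hypothetical): one job description and three skill descriptions.
job_vec = [1.0, 0.0, 0.0]
skill_vecs = [
    [0.0, 1.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
]
ranking = rank_candidates(job_vec, skill_vecs, top_k=2)
print(ranking)
```

With real embeddings, `job_vec` and `skill_vecs` would be rows of the array returned by `model.encode`.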
Information retrieval metrics on the full_en dataset, evaluated with InformationRetrievalEvaluator:
Metric | Value |
---|---|
cosine_accuracy@1 | 0.7204 |
cosine_accuracy@20 | 1.0 |
cosine_accuracy@50 | 1.0 |
cosine_accuracy@100 | 1.0 |
cosine_accuracy@150 | 1.0 |
cosine_accuracy@200 | 1.0 |
cosine_precision@1 | 0.7204 |
cosine_precision@20 | 0.4916 |
cosine_precision@50 | 0.387 |
cosine_precision@100 | 0.3046 |
cosine_precision@150 | 0.2575 |
cosine_precision@200 | 0.225 |
cosine_recall@1 | 0.0102 |
cosine_recall@20 | 0.1316 |
cosine_recall@50 | 0.2509 |
cosine_recall@100 | 0.3866 |
cosine_recall@150 | 0.4822 |
cosine_recall@200 | 0.5556 |
cosine_ndcg@1 | 0.7204 |
cosine_ndcg@20 | 0.5307 |
cosine_ndcg@50 | 0.4445 |
cosine_ndcg@100 | 0.4348 |
cosine_ndcg@150 | 0.4782 |
cosine_ndcg@200 | 0.5218 |
cosine_mrr@1 | 0.7204 |
cosine_mrr@20 | 0.8362 |
cosine_mrr@50 | 0.8362 |
cosine_mrr@100 | 0.8362 |
cosine_mrr@150 | 0.8362 |
cosine_mrr@200 | 0.8362 |
cosine_map@1 | 0.7204 |
cosine_map@20 | 0.335 |
cosine_map@50 | 0.2334 |
cosine_map@100 | 0.2057 |
cosine_map@150 | 0.2211 |
cosine_map@200 | 0.2387 |
cosine_map@500 | 0.288 |
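The table pairs a high accuracy@1 (0.7204) with a very low recall@1 (0.0102), which is what one would expect when each query has many relevant documents: a single correct top hit counts fully toward accuracy and precision but covers only a small fraction of the relevant set. A minimal sketch of the per-query definitions, assuming the usual conventions (precision divides by k, recall divides by the number of relevant documents); the function name and the toy IDs are hypothetical:

```python
def metrics_at_k(ranked_ids, relevant_ids, k):
    """Compute accuracy, precision, recall and reciprocal rank at cutoff k for one query."""
    hits = [doc_id in relevant_ids for doc_id in ranked_ids[:k]]
    num_hits = sum(hits)
    # Rank (1-based) of the first relevant document within the top k, if any.
    first_hit_rank = next((rank for rank, hit in enumerate(hits, start=1) if hit), None)
    return {
        "accuracy": 1.0 if num_hits > 0 else 0.0,
        "precision": num_hits / k,
        "recall": num_hits / len(relevant_ids),
        "mrr": 1.0 / first_hit_rank if first_hit_rank else 0.0,
    }

# Toy query: four retrieved skills, four relevant skills (hypothetical IDs).
ranked = ["s1", "s7", "s3", "s9"]
relevant = {"s1", "s3", "s4", "s5"}
at_1 = metrics_at_k(ranked, relevant, k=1)
at_4 = metrics_at_k(ranked, relevant, k=4)
print(at_1)
print(at_4)
```

The reported values would then be averages of these per-query numbers over all queries in the evaluation set.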
Columns: anchor and positive
 | anchor | positive |
---|---|---|
type | string | string |
anchor | positive |
---|---|
A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles that require promoting health and safety include occupational health and safety specialists, safety managers, and public health educators, all of whom work to ensure safe and healthy environments in workplaces and communities. |
A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles that require organizing rehearsals include directors, choreographers, and conductors in theater, dance, and music ensembles, who must efficiently plan and schedule practice sessions to prepare performers for a successful final performance. |
A technical director or any of its synonyms requires a strong blend of technical expertise and leadership skills, including the ability to oversee technical operations, manage teams, and ensure the successful execution of technical projects while maintaining operational efficiency and innovation. | Job roles such as Health and Safety Managers, Environmental Health Officers, and Risk Management Specialists often require the skill of negotiating health and safety issues with third parties to ensure compliance and protection standards are met across different organizations and sites. |
Loss: CachedGISTEmbedLoss with these parameters:
{'guide': SentenceTransformer(
(0): Transformer({'max_seq_length': 128, 'do_lower_case': False}) with Transformer model: BertModel
(1): Pooling({'word_embedding_dimension': 384, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
(2): Normalize()
), 'temperature': 0.01, 'mini_batch_size': 32, 'margin_strategy': 'absolute', 'margin': 0.0}
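The parameters above describe a guided in-batch contrastive loss: a smaller guide model scores every anchor/positive pairing in the batch, and in-batch negatives that the guide rates above the true positive are dropped as likely false negatives. The following is a simplified sketch of that idea only, assuming the `absolute` margin strategy means "mask when guide similarity exceeds the positive pair's guide similarity minus the margin". It works on precomputed similarity matrices, considers only anchor-to-positive similarities, and omits the caching and mini-batching of the library implementation; the function name and toy matrices are hypothetical:

```python
import math

def gist_masked_loss(student_sim, guide_sim, temperature=0.01, margin=0.0):
    """Average cross-entropy over anchors, skipping negatives flagged by the guide.

    student_sim[i][j] and guide_sim[i][j] are similarities between anchor i
    and positive j; the diagonal holds the true pairs.
    """
    n = len(student_sim)
    total = 0.0
    for i in range(n):
        threshold = guide_sim[i][i] - margin  # assumed 'absolute' margin rule
        logits = []
        for j in range(n):
            if j != i and guide_sim[i][j] > threshold:
                continue  # likely false negative according to the guide
            logits.append((student_sim[i][j] / temperature, j == i))
        # Numerically stable log-sum-exp over the remaining candidates.
        max_logit = max(score for score, _ in logits)
        log_denominator = max_logit + math.log(
            sum(math.exp(score - max_logit) for score, _ in logits)
        )
        positive_logit = next(score for score, is_pos in logits if is_pos)
        total += log_denominator - positive_logit
    return total / n

# Toy batch of two pairs (hypothetical similarities).
student = [[0.9, 0.8], [0.1, 0.7]]
guide_flagging = [[0.8, 0.9], [0.2, 0.9]]   # guide prefers positive 1 for anchor 0
guide_agreeing = [[0.9, 0.1], [0.2, 0.9]]   # guide agrees with the labels
loss_masked = gist_masked_loss(student, guide_flagging)
loss_unmasked = gist_masked_loss(student, guide_agreeing)
print(loss_masked, loss_unmasked)
```

In the first case the competing negative for anchor 0 is removed, so it no longer contributes to the loss; in the second case it stays in the denominator.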
Non-default hyperparameters:
eval_strategy: steps
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
gradient_accumulation_steps: 2
num_train_epochs: 5
warmup_ratio: 0.05
log_on_each_node: False
fp16: True
dataloader_num_workers: 4
ddp_find_unused_parameters: True
batch_sampler: no_duplicates
All hyperparameters:
overwrite_output_dir: False
do_predict: False
eval_strategy: steps
prediction_loss_only: True
per_device_train_batch_size: 128
per_device_eval_batch_size: 128
per_gpu_train_batch_size: None
per_gpu_eval_batch_size: None
gradient_accumulation_steps: 2
eval_accumulation_steps: None
torch_empty_cache_steps: None
learning_rate: 5e-05
weight_decay: 0.0
adam_beta1: 0.9
adam_beta2: 0.999
adam_epsilon: 1e-08
max_grad_norm: 1.0
num_train_epochs: 5
max_steps: -1
lr_scheduler_type: linear
lr_scheduler_kwargs: {}
warmup_ratio: 0.05
warmup_steps: 0
log_level: passive
log_level_replica: warning
log_on_each_node: False
logging_nan_inf_filter: True
save_safetensors: True
save_on_each_node: False
save_only_model: False
restore_callback_states_from_checkpoint: False
no_cuda: False
use_cpu: False
use_mps_device: False
seed: 42
data_seed: None
jit_mode_eval: False
use_ipex: False
bf16: False
fp16: True
fp16_opt_level: O1
half_precision_backend: auto
bf16_full_eval: False
fp16_full_eval: False
tf32: None
local_rank: 0
ddp_backend: None
tpu_num_cores: None
tpu_metrics_debug: False
debug: []
dataloader_drop_last: True
dataloader_num_workers: 4
dataloader_prefetch_factor: None
past_index: -1
disable_tqdm: False
remove_unused_columns: True
label_names: None
load_best_model_at_end: False
ignore_data_skip: False
fsdp: []
fsdp_min_num_params: 0
fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
tp_size: 0
fsdp_transformer_layer_cls_to_wrap: None
accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
deepspeed: None
label_smoothing_factor: 0.0
optim: adamw_torch
optim_args: None
adafactor: False
group_by_length: False
length_column_name: length
ddp_find_unused_parameters: True
ddp_bucket_cap_mb: None
ddp_broadcast_buffers: False
dataloader_pin_memory: True
dataloader_persistent_workers: False
skip_memory_metrics: True
use_legacy_prediction_loop: False
push_to_hub: False
resume_from_checkpoint: None
hub_model_id: None
hub_strategy: every_save
hub_private_repo: None
hub_always_push: False
gradient_checkpointing: False
gradient_checkpointing_kwargs: None
include_inputs_for_metrics: False
include_for_metrics: []
eval_do_concat_batches: True
fp16_backend: auto
push_to_hub_model_id: None
push_to_hub_organization: None
mp_parameters:
auto_find_batch_size: False
full_determinism: False
torchdynamo: None
ray_scope: last
ddp_timeout: 1800
torch_compile: False
torch_compile_backend: None
torch_compile_mode: None
include_tokens_per_second: False
include_num_input_tokens_seen: False
neftune_noise_alpha: None
optim_target_modules: None
batch_eval_metrics: False
eval_on_start: False
use_liger_kernel: False
eval_use_gather_object: False
average_tokens_across_devices: False
prompts: None
batch_sampler: no_duplicates
multi_dataset_batch_sampler: proportional
Epoch | Step | Training Loss | full_en_cosine_ndcg@200 |
---|---|---|---|
-1 | -1 | - | 0.4795 |
0.0022 | 1 | 10.6462 | - |
0.2232 | 100 | 4.5115 | - |
0.4464 | 200 | 2.9237 | 0.5255 |
0.6696 | 300 | 2.5327 | - |
0.8929 | 400 | 2.3451 | 0.5305 |
1.1161 | 500 | 1.9882 | - |
1.3393 | 600 | 1.7738 | 0.5240 |
1.5625 | 700 | 1.7365 | - |
1.7857 | 800 | 1.6932 | 0.5251 |
2.0089 | 900 | 1.6184 | - |
2.2321 | 1000 | 1.285 | 0.5254 |
2.4554 | 1100 | 1.2651 | - |
2.6786 | 1200 | 1.2739 | 0.5238 |
2.9018 | 1300 | 1.2625 | - |
3.125 | 1400 | 1.0726 | 0.5251 |
3.3482 | 1500 | 0.9606 | - |
3.5714 | 1600 | 0.9594 | 0.5214 |
3.7946 | 1700 | 0.954 | - |
4.0179 | 1800 | 0.9264 | 0.5239 |
4.2411 | 1900 | 0.7486 | - |
4.4643 | 2000 | 0.7424 | 0.5218 |
4.6875 | 2100 | 0.7127 | - |
4.9107 | 2200 | 0.7129 | 0.5218 |
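The listed `learning_rate` (5e-05), `lr_scheduler_type` (linear) and `warmup_ratio` (0.05) imply a learning rate that ramps up linearly and then decays linearly to zero. A minimal sketch of that schedule follows. The figure of 448 optimizer steps per epoch is an assumption inferred from the log above (step 100 falls at epoch 0.2232), and the function name is illustrative rather than a library call:

```python
import math

def linear_schedule_lr(step, base_lr, total_steps, warmup_steps):
    """Learning rate at a given optimizer step for linear warmup plus linear decay."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

BASE_LR = 5e-05
STEPS_PER_EPOCH = 448                      # assumption: inferred from the training log
TOTAL_STEPS = 5 * STEPS_PER_EPOCH          # num_train_epochs = 5
WARMUP_STEPS = math.ceil(TOTAL_STEPS * 0.05)  # warmup_ratio = 0.05

for step in (0, 56, WARMUP_STEPS, 1000, TOTAL_STEPS):
    lr = linear_schedule_lr(step, BASE_LR, TOTAL_STEPS, WARMUP_STEPS)
    print(step, lr)
```

Under these assumptions the peak rate is reached shortly after the first logged step at 100, so most of the loss reduction in the table happens during the decay phase.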
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
Base model: BAAI/bge-large-en-v1.5