SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5

This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: Alibaba-NLP/gte-large-en-v1.5
  • Maximum Sequence Length: 8192 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Model Size: 434M parameters (F32, safetensors)

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
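
The Pooling block above is configured for CLS-token pooling ('pooling_mode_cls_token': True), so the sentence embedding is simply the transformer's hidden state at the first token. Below is a minimal sketch of what the two modules do, assuming the model runs on CPU; trust_remote_code is needed because the gte base architecture (NewModel) ships custom modeling code:

import torch
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("dpanea/skill-assignment-transformer", trust_remote_code=True)

# Module (0): the transformer produces one 1024-dim vector per token
features = model.tokenize(["An example sentence"])
with torch.no_grad():
    token_embeddings = model[0](features)["token_embeddings"]

# Module (1): CLS pooling keeps only the first token's hidden state
cls_embedding = token_embeddings[:, 0]
print(cls_embedding.shape)  # torch.Size([1, 1024])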

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
# trust_remote_code is required because the gte base architecture (NewModel) uses custom modeling code
model = SentenceTransformer("dpanea/skill-assignment-transformer", trust_remote_code=True)
# Run inference
sentences = [
    'What is the artefact?    This is catter this affect is called a cold   Almond   What are the features of the artefact?  The features of this artefact are it looks like   a gold snake with inscribed writing on the   inside   Question 2  What aspect of Ancient Roman society does this artefact represent?  This aspect represents the 1st century AD   What does the artefact tell us about Ancient Roman Society?  This artefact tells us about the 1st century AD The plantations keep the stones gave them girls How does this artefact give us an understanding about Ancient Roman society? it gives us an understanding about Ancient Roman society, because the plantations keep slaves and gift the stones. and forced them to wear it',
    'Writing Convention Skills: Conventions of Writing',
    'Sentence Construction Skills: I can construct basic sentences',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
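
Because the training data pairs student work with skill descriptors (see Training Details below), a natural application is ranking candidate skills against a piece of student writing. A hypothetical sketch, continuing from the snippet above, with skill strings taken from the training samples:

# Embed a work sample and candidate skill descriptors, then rank by similarity
student_work = "What is the artefact? the artefact is a gold armband..."
skills = [
    "Writing Convention Skills: Conventions of Writing",
    "Sentence Construction Skills: I can construct basic sentences",
    "Emotionally Engaging Language: I can evoke an emotional response through emotive language.",
]

work_emb = model.encode([student_work])
skill_embs = model.encode(skills)
scores = model.similarity(work_emb, skill_embs)[0]

for skill, score in sorted(zip(skills, scores.tolist()), key=lambda pair: -pair[1]):
    print(f"{score:.3f}  {skill}")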

Training Details

Training Dataset

Unnamed Dataset

  • Size: 11,779 training samples
  • Columns: sentence_0, sentence_1, and sentence_2
  • Approximate statistics based on the first 1000 samples:
             sentence_0           sentence_1        sentence_2
    type     string               string            string
    min      124 tokens           7 tokens          6 tokens
    mean     615.96 tokens        19.72 tokens      19.55 tokens
    max      1566 tokens          69 tokens         53 tokens
  • Samples:
    Sample 1
    sentence_0:
      2024 POETRY FEATURE ARTICLE – SCAFFOLD - blank
      Name:
      Song Chosen: SET IT ALL FREE
      Poem Chosen: STILL, I RISE
      Common theme: These form together to give the message of overcoming challenges and rising above difficulties with confidence and strength.
      ]
      THIS Scaffold could be submitted as your draft.
      HEADLINE: It needs to be strong, catchy and stimulate the reader. Try for ‘ear appeal’ or ‘brain appeal’ if you can. Possibly use alliteration or a pun. Just use the title of your poem until you can think of a title for the article. FOCUS BLUB: A brief, gripping sentence or two that lets readers know more specifically what the article is about. It gives a sense of the style of your piece. / Voiceworks - Whispers Of Wisdom Discover the themes of resilience and empowerment in Scarlett Johanssons “set it all Free” and mya Angelou’s “still I rise” I will explore how these works help us to overcome adversity and embrace our true strength...
    sentence_1: Emotionally Engaging Language: I can evoke an emotional response through emotive language.
    sentence_2: Reference Formatting Skills: Formats the reference list/bibliography correctly.

    Sample 2
    sentence_0: Why is there no fuel for the next 500 kilometers? We need fuel and there is no way to turn back.This is such a bad time.We need fuel and i am gonna rage quit and drive us off the bridge if we can't get fuel any time soon pull over it's my turn, to drive you have been driving for the last hour and i want t go speeding, down this hill and get to the fuel station quicker, you drive way to slow and it is annoying me.Ok fine i'm pulling over.Finally ok i see that red car coming ,he wants to race and im racing him.ya i beat him but now we only have enough fuel for the next 200 km and the next fuel station is 250 km away i will drive until we run out of fuel then we will have to push and i'm paying for the fuel don't even think about paying for the fuel little brother.Ok time to push.No i am not pushing the car and you can not make me just because u are 1 year older than me does no mean can boss me around.Fine i will push lazy boy.What Why is the gas station shut down and the next one is 300k...
    sentence_1: Essay Organization Skills: Essay Writing
    sentence_2: Case Evaluation Skills: Does the student include discerning evaluation of ideas to support their case for positive change?

    Sample 3
    sentence_0: What is the artefact? the artefact is a gold armband. What are the features of the artefacts? the features on the arte fact it's a gold amband it looks like it beendigging to look like a snake rap around ur arm. you can see the snake scale's and and snake head on the amberd. Question 2 What aspect of Ancient Roman society does this artefact represent? the artefacts represent partion partion partian partian head tate were the richest people in human society it tells us that partions were the richest people in Aome Home society. patients were on of social What does the artefact tell us about Ancient Roman society? pyramid. they had all theexpertsn suf and they had Slaves How does this artefact give us an understanding about Ancient Roman society? the artefact gives us a understanding their were rich people and Cparthers) they had a late more money then all the others people in home society. 7
    sentence_1: Spelling Visuals: Spelling visual - 4
    sentence_2: Event Setting Visualization Skills: I can use technical vocabulary, contemporary language and images to create a sense of the event and the setting
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
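
A minimal sketch of how this loss plugs into the Sentence Transformers trainer API; the column names mirror the dataset above, and the example strings are placeholders:

from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer, losses

model = SentenceTransformer("Alibaba-NLP/gte-large-en-v1.5", trust_remote_code=True)

# Columns are read positionally as (anchor, positive, negative)
train_dataset = Dataset.from_dict({
    "sentence_0": ["<student work sample>"],
    "sentence_1": ["<matching skill descriptor>"],
    "sentence_2": ["<non-matching skill descriptor>"],
})

loss = losses.TripletLoss(
    model,
    distance_metric=losses.TripletDistanceMetric.EUCLIDEAN,
    triplet_margin=5,
)

trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()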
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • multi_dataset_batch_sampler: round_robin
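
These map one-to-one onto SentenceTransformerTrainingArguments; a sketch with a placeholder output_dir:

from sentence_transformers import SentenceTransformerTrainingArguments
from sentence_transformers.training_args import MultiDatasetBatchSamplers

args = SentenceTransformerTrainingArguments(
    output_dir="skill-assignment-transformer",  # placeholder
    eval_strategy="steps",
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    num_train_epochs=3,
    multi_dataset_batch_sampler=MultiDatasetBatchSamplers.ROUND_ROBIN,
)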

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 4
  • per_device_eval_batch_size: 4
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 3
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • hub_revision: None
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • liger_kernel_config: None
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

Epoch Step Training Loss
0.0340 100 -
0.0679 200 -
0.1019 300 -
0.1358 400 -
0.1698 500 1.7346
0.2037 600 -
0.2377 700 -
0.2716 800 -
0.3056 900 -
0.3396 1000 0.8428
0.3735 1100 -
0.4075 1200 -
0.4414 1300 -
0.4754 1400 -
0.5093 1500 0.4421
0.5433 1600 -
0.5772 1700 -
0.6112 1800 -
0.6452 1900 -
0.6791 2000 0.3366
0.7131 2100 -
0.7470 2200 -
0.7810 2300 -
0.8149 2400 -
0.8489 2500 0.2568
0.8829 2600 -
0.9168 2700 -
0.9508 2800 -
0.9847 2900 -
1.0 2945 -
1.0187 3000 0.1666
1.0526 3100 -
1.0866 3200 -
1.1205 3300 -
1.1545 3400 -
1.1885 3500 0.1027
1.2224 3600 -
1.2564 3700 -
1.2903 3800 -
1.3243 3900 -
1.3582 4000 0.0657
1.3922 4100 -
1.4261 4200 -
1.4601 4300 -
1.4941 4400 -
1.5280 4500 0.0788
1.5620 4600 -
1.5959 4700 -
1.6299 4800 -
1.6638 4900 -
1.6978 5000 0.0648
1.7317 5100 -
1.7657 5200 -
1.7997 5300 -
1.8336 5400 -
1.8676 5500 0.0413
1.9015 5600 -
1.9355 5700 -
1.9694 5800 -
2.0 5890 -
2.0034 5900 -
2.0374 6000 0.0293
2.0713 6100 -
2.1053 6200 -
2.1392 6300 -
2.1732 6400 -
2.2071 6500 0.0158
2.2411 6600 -
2.2750 6700 -
2.3090 6800 -
2.3430 6900 -
2.3769 7000 0.0183
2.4109 7100 -
2.4448 7200 -
2.4788 7300 -
2.5127 7400 -
2.5467 7500 0.0079
2.5806 7600 -
2.6146 7700 -
2.6486 7800 -
2.6825 7900 -
2.7165 8000 0.0070
2.7504 8100 -
2.7844 8200 -
2.8183 8300 -
2.8523 8400 -
2.8862 8500 0.0057
2.9202 8600 -
2.9542 8700 -
2.9881 8800 -
3.0 8835 -

Framework Versions

  • Python: 3.10.12
  • Sentence Transformers: 4.1.0
  • Transformers: 4.53.0
  • PyTorch: 2.1.0+cu118
  • Accelerate: 1.8.1
  • Datasets: 3.6.0
  • Tokenizers: 0.21.2
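
To reproduce this environment, the versions above can be pinned at install time (a sketch; pick the PyTorch build matching your CUDA setup, e.g. cu118):

pip install "sentence-transformers==4.1.0" "transformers==4.53.0" \
    "accelerate==1.8.1" "datasets==3.6.0" "tokenizers==0.21.2" "torch==2.1.0"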

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}