SentenceTransformer based on Alibaba-NLP/gte-large-en-v1.5
This is a sentence-transformers model finetuned from Alibaba-NLP/gte-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: Alibaba-NLP/gte-large-en-v1.5
- Maximum Sequence Length: 8192 tokens
- Output Dimensionality: 1024 dimensions
- Similarity Function: Cosine Similarity
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
SentenceTransformer(
(0): Transformer({'max_seq_length': 8192, 'do_lower_case': False}) with Transformer model: NewModel
(1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
)
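In the printout above, the Pooling module has pooling_mode_cls_token set to True and every other mode set to False, so the sentence embedding is taken from the first token's hidden state rather than from an average over all tokens. The following is a minimal pure-Python sketch of that difference on toy vectors. The function names cls_pool and mean_pool are illustrative and are not part of the Sentence Transformers API, and the two-dimensional vectors stand in for the real 1024-dimensional hidden states.

```python
from typing import List


def cls_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Return the first token's vector, which is what CLS pooling keeps."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    return list(token_embeddings[0])


def mean_pool(token_embeddings: List[List[float]]) -> List[float]:
    """Average all token vectors, shown only for contrast with CLS pooling."""
    if not token_embeddings:
        raise ValueError("need at least one token embedding")
    n = len(token_embeddings)
    dim = len(token_embeddings[0])
    return [sum(tok[i] for tok in token_embeddings) / n for i in range(dim)]


# Toy "hidden states" for a two-token input (hypothetical values).
tokens = [[1.0, 2.0], [3.0, 4.0]]
print(cls_pool(tokens))   # [1.0, 2.0]
print(mean_pool(tokens))  # [2.0, 3.0]
```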
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
pip install -U sentence-transformers
Then you can load this model and run inference.
from sentence_transformers import SentenceTransformer
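# Download from the 🤗 Hub
# The base model uses custom modeling code (NewModel), so loading
# may require trust_remote_code=True.
model = SentenceTransformer("dpanea/skill-assignment-transformer", trust_remote_code=True)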
# Run inference
sentences = [
'What is the artefact? This is catter this affect is called a cold Almond What are the features of the artefact? The features of this artefact are it looks like a gold snake with inscribed writing on the inside Question 2 What aspect of Ancient Roman society does this artefact represent? This aspect represents the 1st century AD What does the artefact tell us about Ancient Roman Society? This artefact tells us about the 1st century AD The plantations keep the stones gave them girls How does this artefact give us an understanding about Ancient Roman society? it gives us an understanding about Ancient Roman society, because the plantations keep slaves and gift the stones. and forced them to wear it',
'Writing Convention Skills: Conventions of Writing',
'Sentence Construction Skills: I can construct basic sentences',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
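The example sentences above pair a piece of student writing with two skill descriptions, which suggests the embeddings are meant for ranking candidate skills against a text. As a rough sketch of that step, the code below ranks candidates by cosine similarity in plain Python. The labels skill_a and skill_b and the two-dimensional vectors are hypothetical stand-ins; with the real model, the vectors would come from model.encode.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(
    query: Sequence[float], candidates: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Sort candidate labels by descending cosine similarity to the query."""
    scored = [(label, cosine_similarity(query, vec)) for label, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy embeddings standing in for model.encode(...) output.
query_vec = [1.0, 0.0]
skill_vecs = {"skill_b": [0.0, 1.0], "skill_a": [0.9, 0.1]}

ranked = rank_candidates(query_vec, skill_vecs)
print([label for label, _ in ranked])  # ['skill_a', 'skill_b']
```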
Training Details
Training Dataset
Unnamed Dataset
- Size: 11,779 training samples
- Columns: sentence_0, sentence_1, and sentence_2
- Approximate statistics based on the first 1000 samples:
 | sentence_0 | sentence_1 | sentence_2 |
---|---|---|---|
type | string | string | string |
details | min: 124 tokens, mean: 615.96 tokens, max: 1566 tokens | min: 7 tokens, mean: 19.72 tokens, max: 69 tokens | min: 6 tokens, mean: 19.55 tokens, max: 53 tokens |
- Samples:
sentence_0 | sentence_1 | sentence_2 |
---|---|---|
2024 POETRY FEATURE ARTICLE – SCAFFOLD - blank Name: Song Chosen: SET IT ALL FREE Poem Chosen: STILL, I RISE Common theme: These form together to give the message of overcoming challenges and rising above difficulties with confidence and strength. ] THIS Scaffold could be submitted as your draft. HEADLINE: It needs to be strong, catchy and stimulate the reader. Try for ‘ear appeal’ or ‘brain appeal’ if you can. Possibly use alliteration or a pun. Just use the title of your poem until you can think of a title for the article. FOCUS BLUB: A brief, gripping sentence or two that lets readers know more specifically what the article is about. It gives a sense of the style of your piece. / Voiceworks - Whispers Of Wisdom Discover the themes of resilience and empowerment in Scarlett Johanssons “set it all Free” and mya Angelou’s “still I rise” I will explore how these works help us to overcome adversity and embrace our true strength... | Emotionally Engaging Language: I can evoke an emotional response through emotive language. | Reference Formatting Skills: Formats the reference list/bibliography correctly. |
Why is there no fuel for the next 500 kilometers? We need fuel and there is no way to turn back.This is such a bad time.We need fuel and i am gonna rage quit and drive us off the bridge if we can't get fuel any time soon pull over it's my turn, to drive you have been driving for the last hour and i want t go speeding, down this hill and get to the fuel station quicker, you drive way to slow and it is annoying me.Ok fine i'm pulling over.Finally ok i see that red car coming ,he wants to race and im racing him.ya i beat him but now we only have enough fuel for the next 200 km and the next fuel station is 250 km away i will drive until we run out of fuel then we will have to push and i'm paying for the fuel don't even think about paying for the fuel little brother.Ok time to push.No i am not pushing the car and you can not make me just because u are 1 year older than me does no mean can boss me around.Fine i will push lazy boy.What Why is the gas station shut down and the next one is 300k... | Essay Organization Skills: Essay Writing | Case Evaluation Skills: Does the student include discerning evaluation of ideas to support their case for positive change? |
What is the artefact? the artefact is a gold armband. What are the features of the artefacts? the features on the arte fact it's a gold amband it looks like it beendigging to look like a snake rap around ur arm. you can see the snake scale's and and snake head on the amberd. Question 2 What aspect of Ancient Roman society does this artefact represent? the artefacts represent partion partion partian partian head tate were the richest people in human society it tells us that partions were the richest people in Aome Home society. patients were on of social What does the artefact tell us about Ancient Roman society? pyramid. they had all theexpertsn suf and they had Slaves How does this artefact give us an understanding about Ancient Roman society? the artefact gives us a understanding their were rich people and Cparthers) they had a late more money then all the others people in home society. 7 | Spelling Visuals: Spelling visual - 4 | Event Setting Visualization Skills: I can use technical vocabulary, contemporary language and images to create a sense of the event and the setting |
- Loss: TripletLoss with these parameters: { "distance_metric": "TripletDistanceMetric.EUCLIDEAN", "triplet_margin": 5 }
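To make the loss parameters above concrete, here is a minimal pure-Python sketch of the standard triplet objective with Euclidean distance and a margin of 5: the loss is zero once the negative is at least one margin farther from the anchor than the positive. It assumes the three columns act as anchor, positive, and negative in that order, which this card does not state explicitly. The function names and toy vectors are illustrative, and the sketch handles a single triplet rather than a batch.

```python
import math
from typing import Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(
    anchor: Sequence[float],
    positive: Sequence[float],
    negative: Sequence[float],
    margin: float = 5.0,
) -> float:
    """max(d(anchor, positive) - d(anchor, negative) + margin, 0) for one triplet."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


anchor = [0.0, 0.0]
positive = [3.0, 4.0]       # distance 5 from the anchor
far_negative = [6.0, 8.0]   # distance 10: margin satisfied
near_negative = [0.0, 1.0]  # distance 1: margin violated

print(triplet_loss(anchor, positive, far_negative))   # 0.0
print(triplet_loss(anchor, positive, near_negative))  # 9.0
```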
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- multi_dataset_batch_sampler: round_robin
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 4
- per_device_eval_batch_size: 4
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 3
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- hub_revision: None
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- liger_kernel_config: None
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
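The hyperparameters list a linear scheduler with a learning rate of 5e-05 and no warmup. As a rough illustration of what that implies, the sketch below computes a linearly decaying learning rate in plain Python. The total of 8835 steps is taken from the final row of the training logs, the function name linear_lr is hypothetical, and the formula is the usual linear-decay rule rather than code copied from the training run.

```python
def linear_lr(
    step: int,
    base_lr: float = 5e-5,
    total_steps: int = 8835,
    warmup_steps: int = 0,
) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 to base_lr during warmup (unused here: warmup is 0).
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the start, roughly mid-run, and at the final step.
for step in (0, 4000, 8835):
    print(step, linear_lr(step))
```

Under these assumptions the rate starts at the base value and reaches zero at the last step, so later epochs take much smaller updates than the first.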
Training Logs
Epoch | Step | Training Loss |
---|---|---|
0.0340 | 100 | - |
0.0679 | 200 | - |
0.1019 | 300 | - |
0.1358 | 400 | - |
0.1698 | 500 | 1.7346 |
0.2037 | 600 | - |
0.2377 | 700 | - |
0.2716 | 800 | - |
0.3056 | 900 | - |
0.3396 | 1000 | 0.8428 |
0.3735 | 1100 | - |
0.4075 | 1200 | - |
0.4414 | 1300 | - |
0.4754 | 1400 | - |
0.5093 | 1500 | 0.4421 |
0.5433 | 1600 | - |
0.5772 | 1700 | - |
0.6112 | 1800 | - |
0.6452 | 1900 | - |
0.6791 | 2000 | 0.3366 |
0.7131 | 2100 | - |
0.7470 | 2200 | - |
0.7810 | 2300 | - |
0.8149 | 2400 | - |
0.8489 | 2500 | 0.2568 |
0.8829 | 2600 | - |
0.9168 | 2700 | - |
0.9508 | 2800 | - |
0.9847 | 2900 | - |
1.0 | 2945 | - |
1.0187 | 3000 | 0.1666 |
1.0526 | 3100 | - |
1.0866 | 3200 | - |
1.1205 | 3300 | - |
1.1545 | 3400 | - |
1.1885 | 3500 | 0.1027 |
1.2224 | 3600 | - |
1.2564 | 3700 | - |
1.2903 | 3800 | - |
1.3243 | 3900 | - |
1.3582 | 4000 | 0.0657 |
1.3922 | 4100 | - |
1.4261 | 4200 | - |
1.4601 | 4300 | - |
1.4941 | 4400 | - |
1.5280 | 4500 | 0.0788 |
1.5620 | 4600 | - |
1.5959 | 4700 | - |
1.6299 | 4800 | - |
1.6638 | 4900 | - |
1.6978 | 5000 | 0.0648 |
1.7317 | 5100 | - |
1.7657 | 5200 | - |
1.7997 | 5300 | - |
1.8336 | 5400 | - |
1.8676 | 5500 | 0.0413 |
1.9015 | 5600 | - |
1.9355 | 5700 | - |
1.9694 | 5800 | - |
2.0 | 5890 | - |
2.0034 | 5900 | - |
2.0374 | 6000 | 0.0293 |
2.0713 | 6100 | - |
2.1053 | 6200 | - |
2.1392 | 6300 | - |
2.1732 | 6400 | - |
2.2071 | 6500 | 0.0158 |
2.2411 | 6600 | - |
2.2750 | 6700 | - |
2.3090 | 6800 | - |
2.3430 | 6900 | - |
2.3769 | 7000 | 0.0183 |
2.4109 | 7100 | - |
2.4448 | 7200 | - |
2.4788 | 7300 | - |
2.5127 | 7400 | - |
2.5467 | 7500 | 0.0079 |
2.5806 | 7600 | - |
2.6146 | 7700 | - |
2.6486 | 7800 | - |
2.6825 | 7900 | - |
2.7165 | 8000 | 0.007 |
2.7504 | 8100 | - |
2.7844 | 8200 | - |
2.8183 | 8300 | - |
2.8523 | 8400 | - |
2.8862 | 8500 | 0.0057 |
2.9202 | 8600 | - |
2.9542 | 8700 | - |
2.9881 | 8800 | - |
3.0 | 8835 | - |
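The step counts in the table can be cross-checked against the dataset size and batch size reported earlier. The short sketch below assumes training on a single device (this card does not state the device count), with gradient_accumulation_steps of 1 and dataloader_drop_last set to False as listed in the hyperparameters. The helper name steps_per_epoch is illustrative.

```python
import math


def steps_per_epoch(
    num_samples: int, batch_size: int, grad_accum: int = 1, num_devices: int = 1
) -> int:
    """Optimizer steps per epoch when the last partial batch is kept."""
    batches = math.ceil(num_samples / (batch_size * num_devices))
    return math.ceil(batches / grad_accum)


per_epoch = steps_per_epoch(11_779, 4)
total = per_epoch * 3  # num_train_epochs = 3

print(per_epoch, total)            # 2945 8835
print(round(100 / per_epoch, 4))   # 0.034
```

Both values agree with the rows at epochs 1.0 and 3.0, and the fractional epoch for step 100 matches the first logged row.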
Framework Versions
- Python: 3.10.12
- Sentence Transformers: 4.1.0
- Transformers: 4.53.0
- PyTorch: 2.1.0+cu118
- Accelerate: 1.8.1
- Datasets: 3.6.0
- Tokenizers: 0.21.2
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}
TripletLoss
@misc{hermans2017defense,
title={In Defense of the Triplet Loss for Person Re-Identification},
author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
year={2017},
eprint={1703.07737},
archivePrefix={arXiv},
primaryClass={cs.CV}
}