---
tags:
- sentence-transformers
- sentence-similarity
- feature-extraction
- generated_from_trainer
- dataset_size:6066
- loss:OnlineContrastiveLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
- source_sentence: >-
    Mitochondria, often called 'powerhouses of the cell,' generate most of the
    cell's ATP through cellular respiration and have their own DNA.
  sentences:
  - >-
    Plate tectonics theory explains that Earth's lithosphere is divided into
    plates that move, causing earthquakes, volcanoes, and mountain formation.
  - >-
    The Titanic was intentionally sunk as part of an insurance scam by J.P.
    Morgan.
  - >-
    Why can't you trust a statistician? They're always plotting something,
    and they have a mean personality.
- source_sentence: >-
    Sharks have existed for about 400 million years, predating trees (which
    appeared around 350 million years ago).
  sentences:
  - What is a physicist's favorite food? Fission chips.
  - >-
    Venus has a surface temperature of ~465°C (870°F) due to a runaway
    greenhouse effect from its dense CO2 atmosphere, making it hotter than
    Mercury.
  - >-
    My therapist told me time heals all wounds. So I stabbed him. Now we
    wait. For science!
- source_sentence: >-
    CRISPR-Cas9 is a gene-editing tool that uses a guide RNA to direct the
    Cas9 enzyme to a specific DNA sequence for cutting.
  sentences:
  - >-
    Plate tectonics theory explains that Earth's lithosphere is divided into
    plates that move, causing earthquakes, volcanoes, and mountain formation.
  - Elvis Presley faked his death and is still alive, living in secret.
  - Why don't skeletons fight each other? They don't have the guts.
- source_sentence: >-
    Venus has a surface temperature of ~465°C (870°F) due to a runaway
    greenhouse effect from its dense CO2 atmosphere, making it hotter than
    Mercury.
  sentences:
  - JFK was assassinated by the CIA/Mafia/LBJ, not a lone gunman.
  - Why do programmers prefer dark mode? Because light attracts bugs.
  - >-
    Plate tectonics theory explains that Earth's lithosphere is divided into
    plates that move, causing earthquakes, volcanoes, and mountain formation.
- source_sentence: Finland doesn't exist; it's a fabrication by Japan and Russia.
  sentences:
  - >-
    Why did the functions stop calling each other? Because they had constant
    arguments and no common ground.
  - >-
    What's a pirate's favorite programming language? Rrrrr! (or C, for the
    sea)
  - >-
    The lost city of Atlantis is real and its advanced technology is hidden
    from us.
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
- cosine_accuracy
- cosine_accuracy_threshold
- cosine_f1
- cosine_f1_threshold
- cosine_precision
- cosine_recall
- cosine_ap
- cosine_mcc
model-index:
- name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
  results:
  - task:
      type: binary-classification
      name: Binary Classification
    dataset:
      name: meme dev binary
      type: meme-dev-binary
    metrics:
    - type: cosine_accuracy
      value: 1.0
      name: Cosine Accuracy
    - type: cosine_accuracy_threshold
      value: 0.7174700498580933
      name: Cosine Accuracy Threshold
    - type: cosine_f1
      value: 1.0
      name: Cosine F1
    - type: cosine_f1_threshold
      value: 0.7174700498580933
      name: Cosine F1 Threshold
    - type: cosine_precision
      value: 1.0
      name: Cosine Precision
    - type: cosine_recall
      value: 1.0
      name: Cosine Recall
    - type: cosine_ap
      value: 0.9999999999999999
      name: Cosine AP
    - type: cosine_mcc
      value: 1.0
      name: Cosine MCC
---
SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. The main goal of this fine-tuned model is to assign memes to one of three clusters:
- Conspiracy
- Educational Science Humor
- Wordplay & Nerd Humor
Try the model here!
Model Details
Model Description
- Model Type: Sentence Transformer
- Base model: sentence-transformers/all-mpnet-base-v2
- Maximum Sequence Length: 384 tokens
- Output Dimensionality: 768 dimensions
- Similarity Function: Cosine Similarity
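As a quick illustration of these properties, here is a minimal sketch (assuming the `PietroSaveri/meme-cluster-classifier` repository id used in the Usage section below) that encodes two memes and checks the embedding dimensionality, sequence length, and cosine similarity:
```python
from sentence_transformers import SentenceTransformer

# Minimal sketch; the repository id is the one shown in the Usage section below.
model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")

embeddings = model.encode([
    "Why do programmers prefer dark mode? Because light attracts bugs.",
    "Elvis Presley faked his death and is still alive, living in secret.",
])
print(embeddings.shape)      # (2, 768) -> 768-dimensional sentence embeddings
print(model.max_seq_length)  # 384 tokens

# Cosine similarity between the two embeddings (the model's similarity function)
print(model.similarity(embeddings, embeddings))  # 2x2 similarity matrix
```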
Model Sources
- Documentation: Sentence Transformers Documentation
- Repository: Sentence Transformers on GitHub
- Hugging Face: Sentence Transformers on Hugging Face
Full Model Architecture
```
SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
```
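The three modules map onto a plain transformers pipeline: contextual token embeddings from MPNet, mean pooling over the attention mask, then L2 normalization. A rough sketch of that pipeline (again assuming the repository id used in the Usage section below):
```python
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

# Sketch of modules (0)-(2) above using plain transformers.
tokenizer = AutoTokenizer.from_pretrained("PietroSaveri/meme-cluster-classifier")
backbone = AutoModel.from_pretrained("PietroSaveri/meme-cluster-classifier")

batch = tokenizer(
    ["Why don't skeletons fight each other? They don't have the guts."],
    padding=True, truncation=True, max_length=384, return_tensors="pt",
)

with torch.no_grad():
    token_embeddings = backbone(**batch).last_hidden_state       # (0) Transformer

mask = batch["attention_mask"].unsqueeze(-1).float()
mean_pooled = (token_embeddings * mask).sum(1) / mask.sum(1)     # (1) Pooling (mean over tokens)
sentence_embedding = F.normalize(mean_pooled, p=2, dim=1)        # (2) Normalize
print(sentence_embedding.shape)  # torch.Size([1, 768])
```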
Usage
Direct Usage (Sentence Transformers)
First install the Sentence Transformers library:
```bash
pip install -U sentence-transformers
```
Then you can load the model and run inference. The seed texts below are illustrative placeholders, one list per cluster; replace them with memes that are representative of your own data.
```python
from sentence_transformers import SentenceTransformer
import numpy as np

# 1) Load the fine-tuned model
fine_tuned_model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")

# 2) Seed texts per cluster (illustrative; use your own representative memes)
seed_texts = {
    "Conspiracy": ["Elvis Presley faked his death and is still alive, living in secret."],
    "Educational Science Humor": ["What is a physicist's favorite food? Fission chips."],
    "Wordplay & Nerd Humor": ["Why do programmers prefer dark mode? Because light attracts bugs."],
}

# 3) Compute centroids just once
seed_centroids = {}
for cat, texts in seed_texts.items():
    embs = fine_tuned_model.encode(texts, convert_to_numpy=True)
    seed_centroids[cat] = embs.mean(axis=0)

# 4) Define a tiny helper for cosine similarity
def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# 5) Wrap it all up in a function
def predict(text: str):
    vec = fine_tuned_model.encode(text, convert_to_numpy=True)
    sims = {cat: cosine_sim(vec, centroid) for cat, centroid in seed_centroids.items()}
    # pick the cluster with the highest similarity
    assigned = max(sims, key=sims.get)
    return sims, assigned

# --- USAGE ---
text = "Why did the biologist go broke? Because his cells were division!"
scores, assigned = predict(text)
print("Raw scores:")
for cat, score in scores.items():
    print(f" {cat:25s}: {score:.3f}")
# Raw scores:
#  Conspiracy               : 0.700
#  Wordplay & Nerd Humor    : 0.907
#  Educational Science Humor: 0.903
```
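For labelling many memes at once, the same centroids can be applied in a single batch. A sketch under the same assumptions as above, reusing `fine_tuned_model` and `seed_centroids` and the library's `similarity` helper for the cosine step:
```python
import numpy as np

def predict_batch(texts):
    vecs = fine_tuned_model.encode(texts, convert_to_numpy=True)   # (n, 768)
    cats = list(seed_centroids)
    centroids = np.stack([seed_centroids[c] for c in cats])        # (k, 768)
    sims = fine_tuned_model.similarity(vecs, centroids).numpy()    # (n, k) cosine matrix
    return [cats[i] for i in sims.argmax(axis=1)]

print(predict_batch([
    "Why don't skeletons fight each other? They don't have the guts.",
    "Finland doesn't exist; it's a fabrication by Japan and Russia.",
]))
```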
Evaluation
Metrics
Binary Classification
- Dataset: meme-dev-binary
- Evaluated with BinaryClassificationEvaluator
Metric | Value |
---|---|
cosine_accuracy | 1.0 |
cosine_accuracy_threshold | 0.7175 |
cosine_f1 | 1.0 |
cosine_f1_threshold | 0.7175 |
cosine_precision | 1.0 |
cosine_recall | 1.0 |
cosine_ap | 1.0 |
cosine_mcc | 1.0 |
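The `BinaryClassificationEvaluator` scores pairs of memes labelled 1 (same cluster) or 0 (different clusters) and sweeps a cosine-similarity threshold. A minimal sketch of how such an evaluation can be run (the pairs below are placeholders, not the actual meme-dev-binary split):
```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.evaluation import BinaryClassificationEvaluator

model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")

# Placeholder dev pairs: label 1 = same cluster, 0 = different clusters.
sentences1 = [
    "Elvis Presley faked his death and is still alive, living in secret.",
    "Why do programmers prefer dark mode? Because light attracts bugs.",
]
sentences2 = [
    "Finland doesn't exist; it's a fabrication by Japan and Russia.",
    "The lost city of Atlantis is real and its advanced technology is hidden from us.",
]
labels = [1, 0]

evaluator = BinaryClassificationEvaluator(sentences1, sentences2, labels, name="meme-dev-binary")
print(evaluator(model))  # cosine_accuracy, cosine_f1, cosine_ap, cosine_mcc, ...
```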
Training Details
Training Dataset
Unnamed Dataset
- Size: 6,066 training samples
- Columns: sentence_0, sentence_1, and label
- Approximate statistics based on the first 1000 samples:
| | sentence_0 | sentence_1 | label |
|---|---|---|---|
| type | string | string | float |
| details | min: 11 tokens, mean: 24.61 tokens, max: 68 tokens | min: 11 tokens, mean: 24.17 tokens, max: 68 tokens | min: 0.0, mean: 0.46, max: 1.0 |
- Samples:
| sentence_0 | sentence_1 | label |
|---|---|---|
| The cure for AIDS was discovered decades ago but suppressed to reduce world population. | Einstein’s theory of general relativity describes gravity not as a force, but as the curvature of spacetime caused by mass and energy. | 0.0 |
| 5G towers are designed to activate nanoparticles from vaccines for population control. | The Mandela Effect proves we've shifted into an alternate reality. | 1.0 |
| The Georgia Guidestones were a NWO manifesto, destroyed to hide the plans. | Elvis Presley faked his death and is still alive, living in secret. | 1.0 |
- Loss: OnlineContrastiveLoss (see the training sketch below)
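A minimal sketch of how a model can be fine-tuned on pairs in this format with OnlineContrastiveLoss (the two rows below are taken from the sample table above; in practice the full 6,066-pair dataset is used):
```python
from datasets import Dataset
from sentence_transformers import SentenceTransformer, SentenceTransformerTrainer
from sentence_transformers.losses import OnlineContrastiveLoss

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")

# Two example pairs in the (sentence_0, sentence_1, label) format described above;
# label 1.0 = same cluster, 0.0 = different clusters.
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "The Georgia Guidestones were a NWO manifesto, destroyed to hide the plans.",
        "The cure for AIDS was discovered decades ago but suppressed to reduce world population.",
    ],
    "sentence_1": [
        "Elvis Presley faked his death and is still alive, living in secret.",
        "Einstein’s theory of general relativity describes gravity not as a force, but as the curvature of spacetime caused by mass and energy.",
    ],
    "label": [1.0, 0.0],
})

loss = OnlineContrastiveLoss(model)
trainer = SentenceTransformerTrainer(model=model, train_dataset=train_dataset, loss=loss)
trainer.train()
```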
Training Hyperparameters
Non-Default Hyperparameters
- eval_strategy: steps
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- num_train_epochs: 4
- multi_dataset_batch_sampler: round_robin
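A sketch of how these non-default settings can be expressed as SentenceTransformerTrainingArguments (the output directory is illustrative; every other argument keeps the defaults listed under All Hyperparameters below):
```python
from sentence_transformers import SentenceTransformerTrainingArguments

args = SentenceTransformerTrainingArguments(
    output_dir="outputs/meme-cluster-classifier",  # illustrative path
    eval_strategy="steps",
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    num_train_epochs=4,
    multi_dataset_batch_sampler="round_robin",
)
```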
All Hyperparameters
- overwrite_output_dir: False
- do_predict: False
- eval_strategy: steps
- prediction_loss_only: True
- per_device_train_batch_size: 16
- per_device_eval_batch_size: 16
- per_gpu_train_batch_size: None
- per_gpu_eval_batch_size: None
- gradient_accumulation_steps: 1
- eval_accumulation_steps: None
- torch_empty_cache_steps: None
- learning_rate: 5e-05
- weight_decay: 0.0
- adam_beta1: 0.9
- adam_beta2: 0.999
- adam_epsilon: 1e-08
- max_grad_norm: 1
- num_train_epochs: 4
- max_steps: -1
- lr_scheduler_type: linear
- lr_scheduler_kwargs: {}
- warmup_ratio: 0.0
- warmup_steps: 0
- log_level: passive
- log_level_replica: warning
- log_on_each_node: True
- logging_nan_inf_filter: True
- save_safetensors: True
- save_on_each_node: False
- save_only_model: False
- restore_callback_states_from_checkpoint: False
- no_cuda: False
- use_cpu: False
- use_mps_device: False
- seed: 42
- data_seed: None
- jit_mode_eval: False
- use_ipex: False
- bf16: False
- fp16: False
- fp16_opt_level: O1
- half_precision_backend: auto
- bf16_full_eval: False
- fp16_full_eval: False
- tf32: None
- local_rank: 0
- ddp_backend: None
- tpu_num_cores: None
- tpu_metrics_debug: False
- debug: []
- dataloader_drop_last: False
- dataloader_num_workers: 0
- dataloader_prefetch_factor: None
- past_index: -1
- disable_tqdm: False
- remove_unused_columns: True
- label_names: None
- load_best_model_at_end: False
- ignore_data_skip: False
- fsdp: []
- fsdp_min_num_params: 0
- fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
- fsdp_transformer_layer_cls_to_wrap: None
- accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
- deepspeed: None
- label_smoothing_factor: 0.0
- optim: adamw_torch
- optim_args: None
- adafactor: False
- group_by_length: False
- length_column_name: length
- ddp_find_unused_parameters: None
- ddp_bucket_cap_mb: None
- ddp_broadcast_buffers: False
- dataloader_pin_memory: True
- dataloader_persistent_workers: False
- skip_memory_metrics: True
- use_legacy_prediction_loop: False
- push_to_hub: False
- resume_from_checkpoint: None
- hub_model_id: None
- hub_strategy: every_save
- hub_private_repo: None
- hub_always_push: False
- gradient_checkpointing: False
- gradient_checkpointing_kwargs: None
- include_inputs_for_metrics: False
- include_for_metrics: []
- eval_do_concat_batches: True
- fp16_backend: auto
- push_to_hub_model_id: None
- push_to_hub_organization: None
- mp_parameters:
- auto_find_batch_size: False
- full_determinism: False
- torchdynamo: None
- ray_scope: last
- ddp_timeout: 1800
- torch_compile: False
- torch_compile_backend: None
- torch_compile_mode: None
- include_tokens_per_second: False
- include_num_input_tokens_seen: False
- neftune_noise_alpha: None
- optim_target_modules: None
- batch_eval_metrics: False
- eval_on_start: False
- use_liger_kernel: False
- eval_use_gather_object: False
- average_tokens_across_devices: False
- prompts: None
- batch_sampler: batch_sampler
- multi_dataset_batch_sampler: round_robin
Training Logs
Epoch | Step | Training Loss | meme-dev-binary_cosine_ap |
---|---|---|---|
0.5 | 190 | - | 0.9999 |
1.0 | 380 | - | 1.0000 |
1.3158 | 500 | 0.3125 | - |
1.5 | 570 | - | 1.0000 |
2.0 | 760 | - | 0.9999 |
2.5 | 950 | - | 1.0000 |
Framework Versions
- Python: 3.11.13
- Sentence Transformers: 4.1.0
- Transformers: 4.52.4
- PyTorch: 2.6.0+cu124
- Accelerate: 1.7.0
- Datasets: 2.14.4
- Tokenizers: 0.21.1
Citation
BibTeX
Sentence Transformers
@inproceedings{reimers-2019-sentence-bert,
title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
author = "Reimers, Nils and Gurevych, Iryna",
booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
month = "11",
year = "2019",
publisher = "Association for Computational Linguistics",
url = "https://arxiv.org/abs/1908.10084",
}