---
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:6066
  - loss:OnlineContrastiveLoss
base_model: sentence-transformers/all-mpnet-base-v2
widget:
  - source_sentence: >-
      Mitochondria, often called 'powerhouses of the cell,' generate most of the
      cell's ATP through cellular respiration and have their own DNA.
    sentences:
      - >-
        Plate tectonics theory explains that Earth's lithosphere is divided into
        plates that move, causing earthquakes, volcanoes, and mountain
        formation.
      - >-
        The Titanic was intentionally sunk as part of an insurance scam by J.P.
        Morgan.
      - >-
        Why can't you trust a statistician? They're always plotting something,
        and they have a mean personality.
  - source_sentence: >-
      Sharks have existed for about 400 million years, predating trees (which
      appeared around 350 million years ago).
    sentences:
      - What is a physicist's favorite food? Fission chips.
      - >-
        Venus has a surface temperature of ~465°C (870°F) due to a runaway
        greenhouse effect from its dense CO2 atmosphere, making it hotter than
        Mercury.
      - >-
        My therapist told me time heals all wounds. So I stabbed him. Now we
        wait. For science!
  - source_sentence: >-
      CRISPR-Cas9 is a gene-editing tool that uses a guide RNA to direct the
      Cas9 enzyme to a specific DNA sequence for cutting.
    sentences:
      - >-
        Plate tectonics theory explains that Earth's lithosphere is divided into
        plates that move, causing earthquakes, volcanoes, and mountain
        formation.
      - Elvis Presley faked his death and is still alive, living in secret.
      - Why don't skeletons fight each other? They don't have the guts.
  - source_sentence: >-
      Venus has a surface temperature of ~465°C (870°F) due to a runaway
      greenhouse effect from its dense CO2 atmosphere, making it hotter than
      Mercury.
    sentences:
      - JFK was assassinated by the CIA/Mafia/LBJ, not a lone gunman.
      - Why do programmers prefer dark mode? Because light attracts bugs.
      - >-
        Plate tectonics theory explains that Earth's lithosphere is divided into
        plates that move, causing earthquakes, volcanoes, and mountain
        formation.
  - source_sentence: Finland doesn't exist; it's a fabrication by Japan and Russia.
    sentences:
      - >-
        Why did the functions stop calling each other? Because they had constant
        arguments and no common ground.
      - >-
        What's a pirate's favorite programming language? Rrrrr! (or C, for the
        sea)
      - >-
        The lost city of Atlantis is real and its advanced technology is hidden
        from us.
pipeline_tag: feature-extraction
library_name: sentence-transformers
metrics:
  - cosine_accuracy
  - cosine_accuracy_threshold
  - cosine_f1
  - cosine_f1_threshold
  - cosine_precision
  - cosine_recall
  - cosine_ap
  - cosine_mcc
model-index:
  - name: SentenceTransformer based on sentence-transformers/all-mpnet-base-v2
    results:
      - task:
          type: binary-classification
          name: Binary Classification
        dataset:
          name: meme dev binary
          type: meme-dev-binary
        metrics:
          - type: cosine_accuracy
            value: 1
            name: Cosine Accuracy
          - type: cosine_accuracy_threshold
            value: 0.7174700498580933
            name: Cosine Accuracy Threshold
          - type: cosine_f1
            value: 1
            name: Cosine F1
          - type: cosine_f1_threshold
            value: 0.7174700498580933
            name: Cosine F1 Threshold
          - type: cosine_precision
            value: 1
            name: Cosine Precision
          - type: cosine_recall
            value: 1
            name: Cosine Recall
          - type: cosine_ap
            value: 0.9999999999999999
            name: Cosine Ap
          - type: cosine_mcc
            value: 1
            name: Cosine Mcc
---

SentenceTransformer based on sentence-transformers/all-mpnet-base-v2

This is a sentence-transformers model finetuned from sentence-transformers/all-mpnet-base-v2. The main goal of this fine-tuned model is to assign memes to three clusters:

  • Conspiracy
  • Educational Science Humor
  • Wordplay & Nerd Humor

Try the model in the meme-cluster Space!

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-mpnet-base-v2
  • Maximum Sequence Length: 384 tokens
  • Output Dimensionality: 768 dimensions
  • Similarity Function: Cosine Similarity

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 384, 'do_lower_case': False}) with Transformer model: MPNetModel 
  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
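
Since the final Normalize() module L2-normalizes the output, the 768-dimensional embeddings have unit length and cosine similarity reduces to a plain dot product. A quick sketch to check this (the sentence is one of the widget examples above):

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")
emb = model.encode(
    ["Why do programmers prefer dark mode? Because light attracts bugs."],
    convert_to_numpy=True,
)

print(emb.shape)               # (1, 768): 768-dimensional sentence embeddings
print(np.linalg.norm(emb[0]))  # ~1.0: the Normalize() module L2-normalizes each embedding
```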

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

import numpy as np
from sentence_transformers import SentenceTransformer

# 1) Load the fine-tuned model
model_name = 'PietroSaveri/meme-cluster-classifier'
fine_tuned_model = SentenceTransformer(model_name)

# 2) Provide a few seed sentences per cluster (illustrative examples; use your own seeds)
seed_texts = {
    "Conspiracy": [
        "Elvis Presley faked his death and is still alive, living in secret.",
        "Finland doesn't exist; it's a fabrication by Japan and Russia.",
    ],
    "Educational Science Humor": [
        "Mitochondria, often called 'powerhouses of the cell,' generate most of the cell's ATP.",
    ],
    "Wordplay & Nerd Humor": [
        "Why do programmers prefer dark mode? Because light attracts bugs.",
    ],
}

# 3) Compute centroids just once
seed_centroids = {}
for cat, texts in seed_texts.items():
    embs = fine_tuned_model.encode(texts, convert_to_numpy=True)
    seed_centroids[cat] = embs.mean(axis=0)

# 4) Define a tiny helper for cosine similarity
def cosine_sim(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# 5) Wrap it all up in a function: assign the cluster with the highest similarity
def predict(text: str):
    vec = fine_tuned_model.encode(text, convert_to_numpy=True)
    sims = {cat: cosine_sim(vec, centroid) for cat, centroid in seed_centroids.items()}
    assigned = max(sims, key=sims.get)
    return sims, assigned


# --- USAGE ---
text = "Why did the biologist go broke? Because his cells were division!"
scores, assigned_cluster = predict(text)

print("Raw scores:")
for cat, score in scores.items():
    print(f"  {cat:25s}: {score:.3f}")

# Raw scores:
#   Conspiracy               : 0.700
#   Wordplay & Nerd Humor    : 0.907
#   Educational Science Humor: 0.903

Evaluation

Metrics

Binary Classification

| Metric                    | Value  |
|:--------------------------|:-------|
| cosine_accuracy           | 1.0    |
| cosine_accuracy_threshold | 0.7175 |
| cosine_f1                 | 1.0    |
| cosine_f1_threshold       | 0.7175 |
| cosine_precision          | 1.0    |
| cosine_recall             | 1.0    |
| cosine_ap                 | 1.0    |
| cosine_mcc                | 1.0    |
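
The cosine_accuracy_threshold above (≈ 0.7175) can be reused to turn a pairwise similarity into a same-cluster / different-cluster decision. A minimal sketch, assuming you keep the dev-set threshold as-is (both memes are examples from the widget above):

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("PietroSaveri/meme-cluster-classifier")

# Dev-set threshold reported above; retune it on your own data if the domain differs.
THRESHOLD = 0.7175

a = "Elvis Presley faked his death and is still alive, living in secret."
b = "The lost city of Atlantis is real and its advanced technology is hidden from us."

emb_a, emb_b = model.encode([a, b], convert_to_numpy=True)
score = util.cos_sim(emb_a, emb_b).item()  # cosine similarity of the two embeddings

print(f"similarity={score:.4f} -> {'same cluster' if score >= THRESHOLD else 'different clusters'}")
```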

Training Details

Training Dataset

Unnamed Dataset

  • Size: 6,066 training samples
  • Columns: sentence_0, sentence_1, and label
  • Approximate statistics based on the first 1000 samples:

| | sentence_0 | sentence_1 | label |
|:--------|:-----------|:-----------|:------|
| type | string | string | float |
| details | min: 11 tokens, mean: 24.61 tokens, max: 68 tokens | min: 11 tokens, mean: 24.17 tokens, max: 68 tokens | min: 0.0, mean: 0.46, max: 1.0 |
  • Samples:

| sentence_0 | sentence_1 | label |
|:-----------|:-----------|:------|
| The cure for AIDS was discovered decades ago but suppressed to reduce world population. | Einstein’s theory of general relativity describes gravity not as a force, but as the curvature of spacetime caused by mass and energy. | 0.0 |
| 5G towers are designed to activate nanoparticles from vaccines for population control. | The Mandela Effect proves we've shifted into an alternate reality. | 1.0 |
| The Georgia Guidestones were a NWO manifesto, destroyed to hide the plans. | Elvis Presley faked his death and is still alive, living in secret. | 1.0 |
  • Loss: OnlineContrastiveLoss

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • num_train_epochs: 4
  • multi_dataset_batch_sampler: round_robin
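
As a rough illustration only (not the author's exact training script), a run over (sentence_0, sentence_1, label) pairs with OnlineContrastiveLoss and the non-default hyperparameters above might be set up with the Sentence Transformers Trainer like this; the two sample pairs are taken from the dataset excerpt above:

```python
from datasets import Dataset
from sentence_transformers import (
    SentenceTransformer,
    SentenceTransformerTrainer,
    SentenceTransformerTrainingArguments,
    losses,
)

# Tiny stand-in for the real 6,066-pair training set, using the sample rows shown above
train_dataset = Dataset.from_dict({
    "sentence_0": [
        "5G towers are designed to activate nanoparticles from vaccines for population control.",
        "The cure for AIDS was discovered decades ago but suppressed to reduce world population.",
    ],
    "sentence_1": [
        "The Mandela Effect proves we've shifted into an alternate reality.",
        "Einstein's theory of general relativity describes gravity not as a force, but as the curvature of spacetime caused by mass and energy.",
    ],
    "label": [1.0, 0.0],  # 1 = same cluster, 0 = different clusters
})

model = SentenceTransformer("sentence-transformers/all-mpnet-base-v2")
loss = losses.OnlineContrastiveLoss(model)

args = SentenceTransformerTrainingArguments(
    output_dir="meme-cluster-classifier",
    num_train_epochs=4,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    # eval_strategy="steps",  # used in the original run; requires also passing an eval_dataset
)

trainer = SentenceTransformerTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    loss=loss,
)
trainer.train()
```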

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 16
  • per_device_eval_batch_size: 16
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • torch_empty_cache_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1
  • num_train_epochs: 4
  • max_steps: -1
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.0
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: False
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: None
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • include_for_metrics: []
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • use_liger_kernel: False
  • eval_use_gather_object: False
  • average_tokens_across_devices: False
  • prompts: None
  • batch_sampler: batch_sampler
  • multi_dataset_batch_sampler: round_robin

Training Logs

| Epoch  | Step | Training Loss | meme-dev-binary_cosine_ap |
|:------:|:----:|:-------------:|:-------------------------:|
| 0.5    | 190  | -             | 0.9999                    |
| 1.0    | 380  | -             | 1.0000                    |
| 1.3158 | 500  | 0.3125        | -                         |
| 1.5    | 570  | -             | 1.0000                    |
| 2.0    | 760  | -             | 0.9999                    |
| 2.5    | 950  | -             | 1.0000                    |

Framework Versions

  • Python: 3.11.13
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Accelerate: 1.7.0
  • Datasets: 2.14.4
  • Tokenizers: 0.21.1
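
To approximately reproduce this environment, the core packages can be pinned to the versions listed above (a sketch; pick the PyTorch/CUDA build that matches your hardware):

```
pip install sentence-transformers==4.1.0 transformers==4.52.4 accelerate==1.7.0 datasets==2.14.4 tokenizers==0.21.1
```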

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}