SetFit with sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2

This is a SetFit model for multilabel text classification. It uses sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 as the Sentence Transformer embedding model and a OneVsRestClassifier instance as the classification head.

The model has been trained using an efficient few-shot learning technique that involves:

  1. Fine-tuning a Sentence Transformer with contrastive learning.
  2. Training a classification head with features from the fine-tuned Sentence Transformer (see the sketch after this list).
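
A minimal sketch of this two-phase setup with the SetFit API (the training texts and label vectors below are hypothetical placeholders, not this model's actual training data):

from datasets import Dataset
from setfit import SetFitModel, Trainer

# Phase 1 body: the multilingual Sentence Transformer. "one-vs-rest" wraps
# the classification head in a OneVsRestClassifier for multilabel output.
model = SetFitModel.from_pretrained(
    "sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2",
    multi_target_strategy="one-vs-rest",
)

# Hypothetical few-shot multilabel data: one binary indicator per class.
train_dataset = Dataset.from_dict({
    "text": ["Training infrastructure will be adapted.", "The budget was revised."],
    "label": [[1, 0], [0, 1]],
})

# train() runs contrastive fine-tuning of the body (phase 1),
# then fits the classification head on the tuned embeddings (phase 2).
trainer = Trainer(model=model, train_dataset=train_dataset)
trainer.train()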

Model Details

Model Description

  • Model Type: SetFit
  • Sentence Transformer body: sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
  • Classification head: a OneVsRestClassifier instance
  • Language: multilingual

Model Sources

  • Repository: https://github.com/huggingface/setfit
  • Paper: https://arxiv.org/abs/2209.11055

Uses

Direct Use for Inference

First install the SetFit library:

pip install setfit

Then you can load this model and run inference.

from setfit import SetFitModel

# Download from the 🤗 Hub
model = SetFitModel.from_pretrained("faodl/model_g20_multilabel")
# Run inference
preds = model("Training infrastructure will be adapted to accommodate 
new	programmes.")
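
Because the head is a OneVsRestClassifier, each input is mapped to an independent binary decision per label. A short sketch of batch inference with per-class probabilities (the second sentence is an illustrative placeholder):

# Batch inference: returns one binary vector per input text
preds = model([
    "Training infrastructure will be adapted to accommodate new programmes.",
    "The committee reviewed last year's budget allocations.",
])

# Per-class probabilities from the OneVsRestClassifier head
probs = model.predict_proba([
    "Training infrastructure will be adapted to accommodate new programmes.",
])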

Training Details

Training Set Metrics

Training set   Min   Median    Max
Word count     1     48.9866   1181

Training Hyperparameters

  • batch_size: (16, 16)
  • num_epochs: (1, 1)
  • max_steps: -1
  • sampling_strategy: oversampling
  • num_iterations: 50
  • body_learning_rate: (2e-05, 2e-05)
  • head_learning_rate: 2e-05
  • loss: CosineSimilarityLoss
  • distance_metric: cosine_distance
  • margin: 0.25
  • end_to_end: False
  • use_amp: False
  • warmup_proportion: 0.1
  • l2_weight: 0.01
  • seed: 42
  • eval_max_steps: -1
  • load_best_model_at_end: False
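
For reference, these settings map directly onto setfit.TrainingArguments. A sketch of the equivalent configuration (distance_metric is omitted because cosine distance is SetFit's default):

from sentence_transformers.losses import CosineSimilarityLoss
from setfit import TrainingArguments

# Tuples hold (embedding phase, classifier phase) values.
args = TrainingArguments(
    batch_size=(16, 16),
    num_epochs=(1, 1),
    max_steps=-1,
    sampling_strategy="oversampling",
    num_iterations=50,
    body_learning_rate=(2e-05, 2e-05),
    head_learning_rate=2e-05,
    loss=CosineSimilarityLoss,
    margin=0.25,
    end_to_end=False,
    use_amp=False,
    warmup_proportion=0.1,
    l2_weight=0.01,
    seed=42,
    eval_max_steps=-1,
    load_best_model_at_end=False,
)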

Training Results

Epoch Step Training Loss Validation Loss
0.0002 1 0.2348 -
0.0119 50 0.1747 -
0.0237 100 0.153 -
0.0356 150 0.1314 -
0.0475 200 0.1263 -
0.0593 250 0.1168 -
0.0712 300 0.116 -
0.0831 350 0.098 -
0.0949 400 0.1085 -
0.1068 450 0.0975 -
0.1187 500 0.094 -
0.1305 550 0.082 -
0.1424 600 0.0856 -
0.1543 650 0.0838 -
0.1662 700 0.0762 -
0.1780 750 0.0722 -
0.1899 800 0.0722 -
0.2018 850 0.0634 -
0.2136 900 0.0584 -
0.2255 950 0.0664 -
0.2374 1000 0.0688 -
0.2492 1050 0.0629 -
0.2611 1100 0.0579 -
0.2730 1150 0.0652 -
0.2848 1200 0.0573 -
0.2967 1250 0.0584 -
0.3086 1300 0.0558 -
0.3204 1350 0.0586 -
0.3323 1400 0.0574 -
0.3442 1450 0.0444 -
0.3560 1500 0.0462 -
0.3679 1550 0.0488 -
0.3798 1600 0.0505 -
0.3916 1650 0.0529 -
0.4035 1700 0.0487 -
0.4154 1750 0.0459 -
0.4272 1800 0.0531 -
0.4391 1850 0.0448 -
0.4510 1900 0.0382 -
0.4629 1950 0.0457 -
0.4747 2000 0.0493 -
0.4866 2050 0.0488 -
0.4985 2100 0.049 -
0.5103 2150 0.0495 -
0.5222 2200 0.0402 -
0.5341 2250 0.0493 -
0.5459 2300 0.0496 -
0.5578 2350 0.0438 -
0.5697 2400 0.0361 -
0.5815 2450 0.0428 -
0.5934 2500 0.0419 -
0.6053 2550 0.0416 -
0.6171 2600 0.0338 -
0.6290 2650 0.0397 -
0.6409 2700 0.0385 -
0.6527 2750 0.0285 -
0.6646 2800 0.0461 -
0.6765 2850 0.0341 -
0.6883 2900 0.0379 -
0.7002 2950 0.0435 -
0.7121 3000 0.0341 -
0.7239 3050 0.0395 -
0.7358 3100 0.0424 -
0.7477 3150 0.0415 -
0.7596 3200 0.0422 -
0.7714 3250 0.0402 -
0.7833 3300 0.0309 -
0.7952 3350 0.0379 -
0.8070 3400 0.039 -
0.8189 3450 0.0427 -
0.8308 3500 0.0331 -
0.8426 3550 0.0457 -
0.8545 3600 0.0306 -
0.8664 3650 0.034 -
0.8782 3700 0.0354 -
0.8901 3750 0.0393 -
0.9020 3800 0.036 -
0.9138 3850 0.0339 -
0.9257 3900 0.0332 -
0.9376 3950 0.0274 -
0.9494 4000 0.0372 -
0.9613 4050 0.0319 -
0.9732 4100 0.0339 -
0.9850 4150 0.0349 -
0.9969 4200 0.0383 -

Framework Versions

  • Python: 3.11.13
  • SetFit: 1.1.2
  • Sentence Transformers: 4.1.0
  • Transformers: 4.52.4
  • PyTorch: 2.6.0+cu124
  • Datasets: 3.6.0
  • Tokenizers: 0.21.1

Citation

BibTeX

@article{https://doi.org/10.48550/arxiv.2209.11055,
    doi = {10.48550/ARXIV.2209.11055},
    url = {https://arxiv.org/abs/2209.11055},
    author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
    keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
    title = {Efficient Few-Shot Learning Without Prompts},
    publisher = {arXiv},
    year = {2022},
    copyright = {Creative Commons Attribution 4.0 International}
}