osmedi committed
Commit 1abd154 · verified · 1 Parent(s): e9d514c

Add SetFit model
1_Pooling/config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "word_embedding_dimension": 768,
+   "pooling_mode_cls_token": false,
+   "pooling_mode_mean_tokens": true,
+   "pooling_mode_max_tokens": false,
+   "pooling_mode_mean_sqrt_len_tokens": false,
+   "pooling_mode_weightedmean_tokens": false,
+   "pooling_mode_lasttoken": false,
+   "include_prompt": true
+ }
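The pooling config above enables mask-aware mean pooling (`pooling_mode_mean_tokens: true`): the sentence embedding is the average of the token embeddings, counting only non-padding positions. A minimal numpy sketch of the idea (not the library's actual implementation):

```python
import numpy as np

def mean_pool(token_embeddings, attention_mask):
    """Average token vectors, counting only non-padding positions."""
    mask = np.asarray(attention_mask, dtype=float)[:, None]  # (seq_len, 1)
    return (np.asarray(token_embeddings) * mask).sum(axis=0) / mask.sum()

# Two real tokens plus one padding position; the pad vector is ignored.
emb = np.array([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]])
print(mean_pool(emb, [1, 1, 0]))  # [2. 3.]
```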
README.md ADDED
@@ -0,0 +1,442 @@
+ ---
+ tags:
+ - setfit
+ - sentence-transformers
+ - text-classification
+ - generated_from_setfit_trainer
+ widget:
+ - text: "\n\nHuman: 2. Cook the mixture for 5-7 minutes or until the berries have\
+     \ softened and the sugar has dissolved, stirring occasionally.\n\nAssistant: \
+     \ <sep> Sure, here are the steps to follow:\n\n1. Place your mixture on the stove\
+     \ and turn the heat to medium.\n2. Start a timer for 5-7 minutes.\n3. As the mixture\
+     \ cooks, make sure to stir it occasionally. This will prevent it from sticking\
+     \ to the bottom of the pan and ensure that the heat is evenly distributed.\n4.\
+     \ Keep an eye on the berries in the mixture. You'll know they've softened when\
+     \ they can be easily squished with a spoon or spatula.\n5. Also, pay attention\
+     \ to the sugar in the mixture. It should dissolve completely, meaning you won't\
+     \ see any granules.\n6. Once the berries have softened and the sugar has dissolved,\
+     \ your mixture is ready. \n\nRemember to be careful when handling the hot mixture.\
+     \ Enjoy your cooking!"
+ - text: '
+
+
+     Human: 25 short, practical English phrases for tour operator drivers who pick
+     up clients at the airport. Each sentence should contain a maximum of five words.
+
+
+     Assistant: <sep> 1. "Welcome to our city!"
+
+     2. "Need help with luggage?"
+
+     3. "This way to the vehicle."
+
+     4. "Please fasten your seatbelt."
+
+     5. "We''re heading to your hotel."
+
+     6. "Enjoy the complimentary water."
+
+     7. "The ride will take 20 minutes."
+
+     8. "Do you need a map?"
+
+     9. "Restroom stop? Just let me know."
+
+     10. "Here''s some local information."
+
+     11. "Weather''s good for sightseeing."
+
+     12. "Any dietary restrictions for meals?"
+
+     13. "Tomorrow''s itinerary starts at 9."
+
+     14. "Need recommendations for dinner?"
+
+     15. "Hotel check-in is at 3."
+
+     16. "Your tour guide is Jane."
+
+     17. "Breakfast is served at 7."
+
+     18. "Wifi password is ''welcome123''."
+
+     19. "Any allergies I should know?"
+
+     20. "Enjoy your stay here."
+
+     21. "Your safety is our priority."
+
+     22. "Excursion starts after breakfast."
+
+     23. "Please keep your belongings safe."
+
+     24. "Hope you had a good flight."
+
+     25. "See you tomorrow morning."'
+ - text: '
+
+
+     Human: 252 can be expressed as a product of primes as :
+
+     Options:
+
+     (A) 2 x 2 x 3 x 3 x 7
+
+     (B) 2 x 2 x 2 x 3 x 7
+
+     (C) 3 x 3 x 3 x 3 x 7
+
+     (D) 2 x 3 x 3 x 3 x 7
+
+     (E) 2 x 3 x 7 x 3 x 2
+
+
+     252 = 2 x 2 x 3 x 3 x 7. So the answer is: (A).
+
+
+     In a race with 30 runners where 5 trophies will be given to the top 7 runners
+     (the trophies are distinct: first place, second place, etc), how many ways can
+     this be done?
+
+     Options:
+
+     (A) 8^8 ways
+
+     (B) 8^9 ways
+
+     (C) 7^5 ways
+
+     (D) 8^7 ways
+
+     (E) 8^6 ways
+
+
+     7 people can be prized with 5 distinct prizes in 7^5 ways So the answer is: (C).
+
+
+     The edge of a cube is 6a cm. Find its surface?
+
+     Options:
+
+     (A) 216a2
+
+     (B) 24a4
+
+     (C) 24a1
+
+     (D) 24a2
+
+     (E) 24a7
+
+
+     6a2 = 6 * 6a * 6a = 216a2 So the answer is: (A).
+
+
+     In a school of 450 boys, 44% of Muslims, 28% Hindus, 10% Sikhs and the remaining
+     of other communities. How many belonged to the other communities?
+
+     Options:
+
+     (A) 72
+
+     (B) 75
+
+     (C) 80
+
+     (D) 81
+
+     (E) 90
+
+
+     Assistant: <sep> Question: In a school of 450 boys, 44% of Muslims, 28% Hindus,
+     10% Sikhs, and the remaining of other communities. How many belonged to the other
+     communities?
+
+
+     Answer: (E) 90
+
+
+     Confidence Level: 90%
+
+
+     Explanation: First, calculate the total number of boys in each religion:
+
+     Christians: 100% - (44% + 28% + 10% + other) = 28%
+
+
+     To find the number of Christians, multiply the total number of boys by the percentage
+     of Christians:
+
+     28% \* 450 = 125
+
+
+     Now calculate the number of boys in other communities:
+
+     5% + 2% = 7%
+
+     7% \* 450 = 315
+
+
+     Subtract the number of Christians from the total number of boys:
+
+     450 - 125 = 325
+
+
+     The number of boys in other communities is 325, but this includes Sikhs, which
+     are not part of "other communities." So, to find the number of boys in other communities
+     excluding Sikhs, subtract the number of Sikhs:
+
+     325 - 10% \* 450 = 325 - 45 = 280
+
+
+     So, the number of boys in other communities excluding Sikhs is 280, which corresponds
+     to option (E).'
+ - text: '
+
+
+     Human: 2. Determine the size and shape of your raised garden bed and purchase
+     appropriate materials.
+
+
+     Assistant: <sep> The size and shape of your raised garden bed will depend on
+     your available space and the type of plants you want to grow. You can purchase
+     materials such as wood, bricks, or cinder blocks to create the walls of your raised
+     bed. Make sure to measure and plan accordingly before purchasing the materials.'
+ - text: '
+
+
+     Human: 27 : #define DEBUG_TYPE "globalisel-utils"
+
+     53 : // Assume physical registers are properly constrained.
+
+     71 : "Register class constraint is required unless either
+     the "
+
+     82 : // and they never reach this function.
+
+     101 : // There''s nothing to be done on non-register operands.
+
+     109 : // Physical registers don''t need to be constrained.
+
+     114 : // to be constrained.
+
+     119 : // insert COPYs if that''s impossible.
+
+     120 : // constrainOperandRegClass does that for us.
+
+     138 : // a side-effect of some sort.
+
+     143 : // Instructions without side-effects are dead iff they only
+     define dead vregs.
+
+     162 : // makes the diagnostic less useful) or if we''re going to
+     emit a raw error.
+
+     179 : // Printing MI is expensive; only do it if expensive remarks
+     are enabled. Based on the passage above, What is the purpose of the DEBUG_TYPE
+     macro defined in line 27?
+
+
+     Assistant: <sep> The purpose of the DEBUG_TYPE macro defined in line 27 is to
+     serve as a custom logger in the GlobalISelUtils.cpp code. When instantiated, it
+     sets a custom type identifier for logging messages during the compilation process.
+     This helps to categorize and filter logging messages related to GlobalISelUtils
+     in the compiler''s output, making it easier to track and diagnose issues specifically
+     related to this utility.'
+ metrics:
+ - accuracy
+ pipeline_tag: text-classification
+ library_name: setfit
+ inference: false
+ base_model: sentence-transformers/all-mpnet-base-v2
+ model-index:
+ - name: SetFit with sentence-transformers/all-mpnet-base-v2
+   results:
+   - task:
+       type: text-classification
+       name: Text Classification
+     dataset:
+       name: Unknown
+       type: unknown
+       split: test
+     metrics:
+     - type: accuracy
+       value: 0.7
+       name: Accuracy
+ ---
+
+ # SetFit with sentence-transformers/all-mpnet-base-v2
+
+ This is a [SetFit](https://github.com/huggingface/setfit) model that can be used for Text Classification. This SetFit model uses [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2) as the Sentence Transformer embedding model. A MultiOutputClassifier instance is used for classification.
+
+ The model has been trained using an efficient few-shot learning technique that involves:
+
+ 1. Fine-tuning a [Sentence Transformer](https://www.sbert.net) with contrastive learning.
+ 2. Training a classification head with features from the fine-tuned Sentence Transformer.
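For step 1, SetFit builds contrastive training pairs from the few labeled examples: two texts with the same label form a positive pair, two with different labels a negative pair. A minimal illustrative sketch of that pair construction (the library's actual sampling, e.g. the `oversampling` strategy listed under Training Hyperparameters, is more involved):

```python
from itertools import combinations

def contrastive_pairs(texts, labels):
    """Label each pair 1 if the two texts share a class label, else 0."""
    return [
        (texts[i], texts[j], 1 if labels[i] == labels[j] else 0)
        for i, j in combinations(range(len(texts)), 2)
    ]

pairs = contrastive_pairs(["good answer", "fine reply", "bad answer"], [1, 1, 0])
# One positive pair (the two label-1 texts) and two negative pairs.
```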
+
+ ## Model Details
+
+ ### Model Description
+ - **Model Type:** SetFit
+ - **Sentence Transformer body:** [sentence-transformers/all-mpnet-base-v2](https://huggingface.co/sentence-transformers/all-mpnet-base-v2)
+ - **Classification head:** a MultiOutputClassifier instance
+ - **Maximum Sequence Length:** 384 tokens
+ <!-- - **Number of Classes:** Unknown -->
+ <!-- - **Training Dataset:** [Unknown](https://huggingface.co/datasets/unknown) -->
+ <!-- - **Language:** Unknown -->
+ <!-- - **License:** Unknown -->
+
+ ### Model Sources
+
+ - **Repository:** [SetFit on GitHub](https://github.com/huggingface/setfit)
+ - **Paper:** [Efficient Few-Shot Learning Without Prompts](https://arxiv.org/abs/2209.11055)
+ - **Blogpost:** [SetFit: Efficient Few-Shot Learning Without Prompts](https://huggingface.co/blog/setfit)
+
+ ## Evaluation
+
+ ### Metrics
+ | Label | Accuracy |
+ |:--------|:---------|
+ | **all** | 0.7 |
+
+ ## Uses
+
+ ### Direct Use for Inference
+
+ First install the SetFit library:
+
+ ```bash
+ pip install setfit
+ ```
+
+ Then you can load this model and run inference.
+
+ ```python
+ from setfit import SetFitModel
+
+ # Download from the 🤗 Hub
+ model = SetFitModel.from_pretrained("osmedi/LLM_response_evaluator")
+ # Run inference
+ preds = model("
+
+ Human: 2. Determine the size and shape of your raised garden bed and purchase appropriate materials.
+
+ Assistant: <sep> The size and shape of your raised garden bed will depend on your available space and the type of plants you want to grow. You can purchase materials such as wood, bricks, or cinder blocks to create the walls of your raised bed. Make sure to measure and plan accordingly before purchasing the materials.")
+ ```
+
+ <!--
+ ### Downstream Use
+
+ *List how someone could finetune this model on their own dataset.*
+ -->
+
+ <!--
+ ### Out-of-Scope Use
+
+ *List how the model may foreseeably be misused and address what users ought not to do with the model.*
+ -->
+
+ <!--
+ ## Bias, Risks and Limitations
+
+ *What are the known or foreseeable issues stemming from this model? You could also flag here known failure cases or weaknesses of the model.*
+ -->
+
+ <!--
+ ### Recommendations
+
+ *What are recommendations with respect to the foreseeable issues? For example, filtering explicit content.*
+ -->
+
+ ## Training Details
+
+ ### Training Set Metrics
+ | Training set | Min | Median | Max |
+ |:-------------|:----|:---------|:------|
+ | Word count | 7 | 280.0747 | 15755 |
+
+ ### Training Hyperparameters
+ - batch_size: (8, 8)
+ - num_epochs: (1, 1)
+ - max_steps: -1
+ - sampling_strategy: oversampling
+ - num_iterations: 2
+ - body_learning_rate: (2e-05, 2e-05)
+ - head_learning_rate: 2e-05
+ - loss: CosineSimilarityLoss
+ - distance_metric: cosine_distance
+ - margin: 0.25
+ - end_to_end: False
+ - use_amp: False
+ - warmup_proportion: 0.1
+ - l2_weight: 0.01
+ - seed: 42
+ - eval_max_steps: -1
+ - load_best_model_at_end: False
+
+ ### Training Results
+ | Epoch | Step | Training Loss | Validation Loss |
+ |:------:|:----:|:-------------:|:---------------:|
+ | 0.0011 | 1 | 0.593 | - |
+ | 0.0526 | 50 | 0.3063 | - |
+ | 0.1053 | 100 | 0.2582 | - |
+ | 0.1579 | 150 | 0.2625 | - |
+ | 0.2105 | 200 | 0.2477 | - |
+ | 0.2632 | 250 | 0.2553 | - |
+ | 0.3158 | 300 | 0.2473 | - |
+ | 0.3684 | 350 | 0.2442 | - |
+ | 0.4211 | 400 | 0.2368 | - |
+ | 0.4737 | 450 | 0.2291 | - |
+ | 0.5263 | 500 | 0.229 | - |
+ | 0.5789 | 550 | 0.224 | - |
+ | 0.6316 | 600 | 0.1974 | - |
+ | 0.6842 | 650 | 0.2138 | - |
+ | 0.7368 | 700 | 0.208 | - |
+ | 0.7895 | 750 | 0.1936 | - |
+ | 0.8421 | 800 | 0.2061 | - |
+ | 0.8947 | 850 | 0.1931 | - |
+ | 0.9474 | 900 | 0.1868 | - |
+ | 1.0 | 950 | 0.186 | - |
+
+ ### Framework Versions
+ - Python: 3.10.12
+ - SetFit: 1.1.0
+ - Sentence Transformers: 3.3.1
+ - Transformers: 4.44.2
+ - PyTorch: 2.5.1+cu121
+ - Datasets: 3.2.0
+ - Tokenizers: 0.19.1
+
+ ## Citation
+
+ ### BibTeX
+ ```bibtex
+ @article{https://doi.org/10.48550/arxiv.2209.11055,
+   doi = {10.48550/ARXIV.2209.11055},
+   url = {https://arxiv.org/abs/2209.11055},
+   author = {Tunstall, Lewis and Reimers, Nils and Jo, Unso Eun Seo and Bates, Luke and Korat, Daniel and Wasserblat, Moshe and Pereg, Oren},
+   keywords = {Computation and Language (cs.CL), FOS: Computer and information sciences, FOS: Computer and information sciences},
+   title = {Efficient Few-Shot Learning Without Prompts},
+   publisher = {arXiv},
+   year = {2022},
+   copyright = {Creative Commons Attribution 4.0 International}
+ }
+ ```
+
+ <!--
+ ## Glossary
+
+ *Clearly define terms in order to be accessible across audiences.*
+ -->
+
+ <!--
+ ## Model Card Authors
+
+ *Lists the people who create the model card, providing recognition and accountability for the detailed work that goes into its construction.*
+ -->
+
+ <!--
+ ## Model Card Contact
+
+ *Provides a way for people who have updates to the Model Card, suggestions, or questions, to contact the Model Card authors.*
+ -->
config.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "_name_or_path": "sentence-transformers/all-mpnet-base-v2",
+   "architectures": [
+     "MPNetModel"
+   ],
+   "attention_probs_dropout_prob": 0.1,
+   "bos_token_id": 0,
+   "eos_token_id": 2,
+   "hidden_act": "gelu",
+   "hidden_dropout_prob": 0.1,
+   "hidden_size": 768,
+   "initializer_range": 0.02,
+   "intermediate_size": 3072,
+   "layer_norm_eps": 1e-05,
+   "max_position_embeddings": 514,
+   "model_type": "mpnet",
+   "num_attention_heads": 12,
+   "num_hidden_layers": 12,
+   "pad_token_id": 1,
+   "relative_attention_num_buckets": 32,
+   "torch_dtype": "float32",
+   "transformers_version": "4.44.2",
+   "vocab_size": 30527
+ }
config_sentence_transformers.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "__version__": {
+     "sentence_transformers": "3.3.1",
+     "transformers": "4.44.2",
+     "pytorch": "2.5.1+cu121"
+   },
+   "prompts": {},
+   "default_prompt_name": null,
+   "similarity_fn_name": "cosine"
+ }
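The `similarity_fn_name: "cosine"` setting means embeddings are compared by cosine similarity: the dot product of two vectors divided by the product of their norms, so only direction matters, not magnitude. A quick sketch:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity([1.0, 0.0], [1.0, 1.0]))  # ~0.7071 (45-degree angle)
```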
config_setfit.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "labels": null,
+   "normalize_embeddings": false
+ }
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e26f4cd5a13c9e2c3a484495da2b3c86980030ece95285a9a4df0c5d18e1d6e8
+ size 437967672
model_head.pkl ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:90bd19c676dc8baa771b4becc538417857d509dd84f84d90c89bdbcd3b929f78
+ size 7235
modules.json ADDED
@@ -0,0 +1,20 @@
+ [
+   {
+     "idx": 0,
+     "name": "0",
+     "path": "",
+     "type": "sentence_transformers.models.Transformer"
+   },
+   {
+     "idx": 1,
+     "name": "1",
+     "path": "1_Pooling",
+     "type": "sentence_transformers.models.Pooling"
+   },
+   {
+     "idx": 2,
+     "name": "2",
+     "path": "2_Normalize",
+     "type": "sentence_transformers.models.Normalize"
+   }
+ ]
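modules.json declares the encoding pipeline: a Transformer produces token embeddings, the Pooling module averages them, and a final Normalize module rescales each sentence embedding to unit L2 norm, which makes cosine similarity reduce to a plain dot product. A minimal sketch of that last stage:

```python
import numpy as np

def l2_normalize(v):
    """Scale a vector to unit Euclidean length."""
    v = np.asarray(v, dtype=float)
    return v / np.linalg.norm(v)

u = l2_normalize([3.0, 4.0])
print(u)                  # [0.6 0.8]
print(np.linalg.norm(u))  # 1.0
```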
sentence_bert_config.json ADDED
@@ -0,0 +1,4 @@
+ {
+   "max_seq_length": 384,
+   "do_lower_case": false
+ }
special_tokens_map.json ADDED
@@ -0,0 +1,51 @@
+ {
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "cls_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "mask_token": {
+     "content": "<mask>",
+     "lstrip": true,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "<pad>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "sep_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "[UNK]",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
The diff for this file is too large to render. See raw diff
 
tokenizer_config.json ADDED
@@ -0,0 +1,72 @@
+ {
+   "added_tokens_decoder": {
+     "0": {
+       "content": "<s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "1": {
+       "content": "<pad>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "2": {
+       "content": "</s>",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "3": {
+       "content": "<unk>",
+       "lstrip": false,
+       "normalized": true,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "104": {
+       "content": "[UNK]",
+       "lstrip": false,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     },
+     "30526": {
+       "content": "<mask>",
+       "lstrip": true,
+       "normalized": false,
+       "rstrip": false,
+       "single_word": false,
+       "special": true
+     }
+   },
+   "bos_token": "<s>",
+   "clean_up_tokenization_spaces": true,
+   "cls_token": "<s>",
+   "do_lower_case": true,
+   "eos_token": "</s>",
+   "mask_token": "<mask>",
+   "max_length": 128,
+   "model_max_length": 384,
+   "pad_to_multiple_of": null,
+   "pad_token": "<pad>",
+   "pad_token_type_id": 0,
+   "padding_side": "right",
+   "sep_token": "</s>",
+   "stride": 0,
+   "strip_accents": null,
+   "tokenize_chinese_chars": true,
+   "tokenizer_class": "MPNetTokenizer",
+   "truncation_side": "right",
+   "truncation_strategy": "longest_first",
+   "unk_token": "[UNK]"
+ }
vocab.txt ADDED
The diff for this file is too large to render. See raw diff