SentenceTransformer based on sentence-transformers/all-roberta-large-v1

This is a sentence-transformers model fine-tuned from sentence-transformers/all-roberta-large-v1 on the json dataset. It maps sentences and paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: sentence-transformers/all-roberta-large-v1
  • Maximum Sequence Length: 256 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Training Dataset:
    • json

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: RobertaModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
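
To make the last two modules concrete, here is a minimal sketch of what mean pooling followed by normalization does to the token vectors produced by the Transformer module. It is an illustration in plain Python, not the library's implementation; the function names `mean_pool` and `l2_normalize` and the toy 4-dimensional vectors are assumptions chosen for readability (the real model works with 1024-dimensional vectors).

```python
import math


def mean_pool(token_embeddings, attention_mask):
    """Average the token vectors whose attention-mask entry is 1 (padding is skipped)."""
    dim = len(token_embeddings[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    return [t / max(count, 1) for t in total]


def l2_normalize(vec):
    """Scale a vector to unit length, which is the effect of the Normalize() module."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


# Toy input (hypothetical values): three token vectors, the last one is padding.
tokens = [[1.0, 0.0, 2.0, 0.0], [3.0, 0.0, 0.0, 0.0], [9.0, 9.0, 9.0, 9.0]]
mask = [1, 1, 0]

sentence_vec = l2_normalize(mean_pool(tokens, mask))
print(sentence_vec)
```

Because the output has unit length, the dot product of two sentence vectors equals their cosine similarity.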

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("huangsukai/finetuned-sentence-encoder-for-pddl-gen")
# Run inference
sentences = [
    "Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.\nQuestion: Here is the task.\nA natural language description of the domain\nDomain information: Help the hero to get out of dungeon! A hero woke up in a dungeon full of monsters and traps (perhaps the party last night went wrong...) and needs your help to get out.  Here are basic facts for the dungeon domain: - The dungeon contains rooms that are **connected** by corridors (dungeon can thus be represented by undirected graph) - each room can be **empty**, or can have a **monster** in it, or can have a **trap** in it, or can have a **sword** in it - one of the empty rooms is the **goal**: it has an exit, so the hero can escape\n\nA list of available predicates\n1. (at-hero ?loc - cells) ;;  Hero's cell location\n2. (at-sword ?s - swords ?loc - cells) ;; Sword cell location\n3. (has-monster ?loc - cells) ;; Indicates if a cell location has a monster\n4. (has-trap ?loc - cells) ;; Indicates if a cell location has a trap\n5. (is-destroyed ?obj) ;; Indicates if a chell or sword has been destroyed\n6. (connected ?from ?to - cells) ;; connects cells\n7. (arm-free) ;; Hero's hand is free\n8. (holding ?s - swords) ;; Hero's holding a sword\n9. (trap-disarmed ?loc) ;; It becomes true when a trap is disarmed\n\nAction Description: **Disarm a trap** – if there is a trap in the room the hero is in and the hero is empty-handed (does not hold a sword), then the hero can disarm it\n\nAction name: disarm-trap\n\n\nYour answer:\n",
    'Parameters:\n1. ?loc - cells\n\nPreconditions:\n```\n(and\n    (at-hero ?loc)\n    (arm-free)\n    (has-trap ?loc)\n)\n```\n\nEffects:\n```\n(and\n    (trap-disarmed ?loc)\n    (not (has-trap ?loc))\n)\n```\n',
    'Parameters:\n1. ?from - cells\n2. ?to - cells\n\nPreconditions:\n```\n(and\n    (connected ?from ?to)\n    (at-hero ?from)\n    (not (has-trap ?from))\n    (not (is-destroyed ?to))\n    (not (has-trap ?to))\n    (not (has-monster ?to))\n)\n```\n\nEffects:\n```\n(and\n    (at-hero ?to)\n    (is-destroyed ?from)\n    (not (at-hero ?from))\n)\n```\n',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# [3, 1024]

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# [3, 3]
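
A typical next step is to rank candidate action schemas by their similarity to the task description. The sketch below shows that ranking step on its own, in plain Python, under the assumption that the vectors are already unit length (as this model's Normalize() module produces), so the dot product is the cosine similarity. The helper `rank_candidates` and the 2-dimensional stand-in vectors are hypothetical; in practice the query would be `embeddings[0]` and the candidates `embeddings[1:]`.

```python
def dot(u, v):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(a * b for a, b in zip(u, v))


def rank_candidates(query_vec, candidate_vecs):
    """Return candidate indices sorted from most to least similar, plus the raw scores."""
    scores = [dot(query_vec, c) for c in candidate_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


# Hypothetical unit-length stand-ins for one query and three candidate embeddings.
query = [1.0, 0.0]
candidates = [[0.6, 0.8], [0.0, 1.0], [0.8, 0.6]]

order, scores = rank_candidates(query, candidates)
print(order)   # candidate indices, best match first
print(scores)  # similarity of each candidate to the query
```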

Training Details

Training Dataset

json

  • Dataset: json
  • Size: 200,010 training samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 1000 samples:
    column    type    min         mean           max
    anchor    string  246 tokens  255.74 tokens  256 tokens
    positive  string  71 tokens   138.42 tokens  248 tokens
    negative  string  62 tokens   132.66 tokens  250 tokens
  • Samples:
    Each sample lists the anchor first, then the positive, then the negative.
    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: This domain is structured to allow organizing and managing books within a library setting. The actions and predicates support the movement of books between tables and shelves, ensuring that conditions like accessibility and the librarian's hands being free are met. Additionally, it includes managing book categories, shelf space, and check-out/return processes to reflect a more complex library system.

    A list of available predicates
    1. (on-shelf ?x ?y - book) ;; ?x is on top of ?y on the shelf
    2. (on-table ?x - book) ;; ?x is on the table
    3. (accessible ?x - book) ;; ?x is accessible (not covered)
    4. (hands-free) ;; The hands of the librarian are free
    5. (holding ?x - book) ;; The librarian is holding ?x
    6. (belongs-to-category ?x - book ?cat - category) ;; ?x belongs to the category ?cat
    7. (shelf-empty ?cat - category) ;; The shelf for category ?cat is empty
    8. (shelf-overflow ?cat - category) ;; The shelf for category ?cat is full
    9. (book-request ?book - book) ;; There is a request for book ?book
    10. (return-due ?book - book) ;; Book ?book is due for return
    11. (checked-out ?book - book) ;; Book ?book is checked out

    Action Description: Mark a book as borrowed by a patron, ensuring it's not already taken.

    Action name: check-out


    Your answer:
    positive:
    Parameters:
    1. ?x - book

    Preconditions:
    (and (accessible ?x) (not (checked-out ?x)))

    Effects:
    (and (checked-out ?x) (not (accessible ?x)) (book-request ?x) (return-due ?x))

    negative:
    Parameters:
    1. ?x - book

    Preconditions:
    (and (checked-out ?x) (accessible ?x))

    Effects:
    (and (checked-out ?x) (book-request ?x) (return-due ?x))

    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: This describes a cooking or baking process where ingredients such as eggs and flour are used. The process of baking involves putting eggs in the pan followed by the flour, mix the two and then put the pan in the oven and then remove the pan from the oven to get the baked cake. Lastly, the pan is cleaned using a soap. Well, you can also bake a souffle but in at bit different process.

    A list of available predicates
    1. (is_egg ?egg - ingredient) ;; the ingredient is an egg
    2. (is_flour ?flour - ingredient) ;; the ingredient is flour
    3. (pan_has_egg ?pan - pan) ;; the pan has an egg
    4. (pan_has_flour ?pan - pan) ;; the pan has flour
    5. (pan_is_clean ?pan - pan) ;; the pan is clean
    6. (pan_in_oven ?pan - pan) ;; the pan is in the oven
    7. (in_pan ?x - ingredient ?pan - pan) ;; the ingredient is in the pan
    8. (in_oven ?pan - pan ?oven - oven) ;; the pan is in the oven
    9. (oven_is_full ?oven - oven) ;; the oven is full
    10. (hypothetical ?new - ingredient) ;; the ingredient is hypothetical
    11. (is_mixed ?pan - pan) ;; the ingredients in the pan are mixed
    12. (is_cake ?new - ingredient) ;; the ingredient is a cake
    13. (is_souffle ?new - ingredient) ;; the ingredient is a souffle
    14. (soap_consumed ?soap - soap) ;; the soap is consumed

    Action Description: Action putting pan in oven needs oven and a pan. The pan is put in oven which is not full.

    Action name: put_pan_in_oven


    Your answer:
    positive:
    Parameters:
    1. ?pan - pan
    2. ?oven - oven

    Preconditions:
    (and (not (oven_is_full ?oven)) (not (pan_in_oven ?pan)))

    Effects:
    (and (oven_is_full ?oven) (in_oven ?pan ?oven) (pan_in_oven ?pan))

    negative:
    Parameters:
    1. ?pan - pan
    2. ?oven - oven

    Preconditions:
    (and (not (oven_is_full ?oven)) (not (pan_in_oven ?pan)))

    Effects:
    (and (oven_is_full ?oven) (in_oven ?pan ?oven))

    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: Help the hero to get out of dungeon! A hero woke up in a dungeon full of monsters and traps (perhaps the party last night went wrong...) and needs your help to get out. Here are basic facts for the dungeon domain: - The dungeon contains rooms that are connected by corridors (dungeon can thus be represented by undirected graph) - each room can be empty, or can have a monster in it, or can have a trap in it, or can have a sword in it - one of the empty rooms is the goal: it has an exit, so the hero can escape

    A list of available predicates
    1. (at-hero ?loc - cells) ;; Hero's cell location
    2. (at-sword ?s - swords ?loc - cells) ;; Sword cell location
    3. (has-monster ?loc - cells) ;; Indicates if a cell location has a monster
    4. (has-trap ?loc - cells) ;; Indicates if a cell location has a trap
    5. (is-destroyed ?obj) ;; Indicates if a chell or sword has been destroyed
    6. (connected ?from ?to - cells) ;; connects cells
    7. (arm-free) ;; Hero's hand is free
    8. (holding ?s - swords) ;; Hero's holding a sword
    9. (trap-disarmed ?loc) ;; It becomes true when a trap is disarmed

    Action Description: Hero can move if the - hero is at current location - cells are connected, - there is no trap in current loc, and - destination does not have a trap/monster/has-been-destroyed Effects move the hero, and destroy the original cell. No need to destroy the sword.

    Action name: move


    Your answer:
    positive:
    Parameters:
    1. ?from - cells
    2. ?to - cells

    Preconditions:
    (and (connected ?from ?to) (at-hero ?from) (not (has-trap ?from)) (not (is-destroyed ?to)) (not (has-trap ?to)) (not (has-monster ?to)))

    Effects:
    (and (at-hero ?to) (is-destroyed ?from) (not (at-hero ?from)))

    negative:
    Parameters:
    1. ?from - cells
    2. ?to - cells

    Preconditions:
    (and (connected ?from ?to) (at-hero ?from) (not (has-trap ?from)) (not (is-destroyed ?to)) (not (has-trap ?to)))

    Effects:
    (and (at-hero ?to) (is-destroyed ?from) (not (at-hero ?from)))
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: T5EncoderModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
      (3): Normalize()
    ), 'temperature': 0.01}
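
The idea behind this loss can be sketched in a few lines: a guide model scores every anchor-candidate pair in the batch, and any in-batch candidate that the guide rates above the true positive is dropped from the contrastive softmax, since it is likely a false negative. The code below is a simplified illustration in plain Python, not the library's implementation. It assumes only anchor-positive similarities are used (the explicit negative column and the other similarity blocks are left out), and the function name `gist_loss` and the toy similarity matrices are hypothetical.

```python
import math


def gist_loss(student_sim, guide_sim, temperature=0.01):
    """Simplified GIST-style loss over anchor-positive cosine similarities.

    student_sim[i][j]: similarity of anchor i and positive j under the trained model.
    guide_sim[i][j]:   the same pair scored by the guide model.
    """
    n = len(student_sim)
    losses = []
    for i in range(n):
        threshold = guide_sim[i][i]  # guide's score for the true pair
        logits = []
        for j in range(n):
            # Skip candidates the guide rates above the true positive.
            if j != i and guide_sim[i][j] > threshold:
                continue
            logits.append((j == i, student_sim[i][j] / temperature))
        # Cross-entropy with the true pair as the target (log-sum-exp for stability).
        top = max(value for _, value in logits)
        log_z = top + math.log(sum(math.exp(value - top) for _, value in logits))
        true_logit = next(value for is_true, value in logits if is_true)
        losses.append(log_z - true_logit)
    return sum(losses) / n


# Toy batch of two pairs (hypothetical values).
student_sim = [[0.9, 0.95], [0.1, 0.8]]
guide_sim = [[0.7, 0.9], [0.2, 0.8]]        # guide flags pair (0, 1) as a likely false negative
no_mask_guide = [[1.0, 0.0], [0.0, 1.0]]    # a guide that never masks anything

masked = gist_loss(student_sim, guide_sim)
unmasked = gist_loss(student_sim, no_mask_guide)
print(masked, unmasked)
```

In this toy batch the masked loss is close to zero while the unmasked loss is large, because without the guide the model is penalized for ranking a plausible match above the labeled positive.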
    

Evaluation Dataset

json

  • Dataset: json
  • Size: 34 evaluation samples
  • Columns: anchor, positive, and negative
  • Approximate statistics based on the first 34 samples:
    column    type    min         mean           max
    anchor    string  256 tokens  256.0 tokens   256 tokens
    positive  string  82 tokens   127.94 tokens  176 tokens
    negative  string  64 tokens   121.15 tokens  176 tokens
  • Samples:
    Each sample lists the anchor first, then the positive, then the negative.
    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: This domain is structured to allow organizing and managing books within a library setting. The actions and predicates support the movement of books between tables and shelves, ensuring that conditions like accessibility and the librarian's hands being free are met. Additionally, it includes managing book categories, shelf space, and check-out/return processes to reflect a more complex library system.

    A list of available predicates
    1. (on-shelf ?x ?y - book) ;; ?x is on top of ?y on the shelf
    2. (on-table ?x - book) ;; ?x is on the table
    3. (accessible ?x - book) ;; ?x is accessible (not covered)
    4. (hands-free) ;; The hands of the librarian are free
    5. (holding ?x - book) ;; The librarian is holding ?x
    6. (belongs-to-category ?x - book ?cat - category) ;; ?x belongs to the category ?cat
    7. (shelf-empty ?cat - category) ;; The shelf for category ?cat is empty
    8. (shelf-overflow ?cat - category) ;; The shelf for category ?cat is full
    9. (book-request ?book - book) ;; There is a request for book ?book
    10. (return-due ?book - book) ;; Book ?book is due for return
    11. (checked-out ?book - book) ;; Book ?book is checked out

    Action Description: Consider a librarian holding a book and standing near a shelf. The 'place-on-shelf' action involves placing the held book on top of another book on the shelf, given that the book on the shelf is accessible. This action results in the held book becoming accessible, the book on the shelf becoming inaccessible, and the librarian's hands becoming free.

    Action name: place-on-shelf


    Your answer:
    positive:
    Parameters:
    1. ?x - book
    2. ?y - book
    3. ?cat - category

    Preconditions:
    (and (holding ?x) (accessible ?y) (belongs-to-category ?x ?cat) (not (shelf-overflow ?cat)))

    Effects:
    (and (not (holding ?x)) (not (accessible ?y)) (accessible ?x) (hands-free) (on-shelf ?x ?y) (shelf-empty ?cat))

    negative:
    Parameters:
    1. ?x - book
    2. ?y - book
    3. ?cat - category

    Preconditions:
    (and (accessible ?y) (belongs-to-category ?x ?cat) (not (shelf-overflow ?cat)))

    Effects:
    (and (not (shelf-empty ?cat)) (not (holding ?x)) (not (accessible ?y)) (accessible ?x) (hands-free) (on-shelf ?x ?y))

    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: This domain is structured to allow organizing and managing books within a library setting. The actions and predicates support the movement of books between tables and shelves, ensuring that conditions like accessibility and the librarian's hands being free are met. Additionally, it includes managing book categories, shelf space, and check-out/return processes to reflect a more complex library system.

    A list of available predicates
    1. (on-shelf ?x ?y - book) ;; ?x is on top of ?y on the shelf
    2. (on-table ?x - book) ;; ?x is on the table
    3. (accessible ?x - book) ;; ?x is accessible (not covered)
    4. (hands-free) ;; The hands of the librarian are free
    5. (holding ?x - book) ;; The librarian is holding ?x
    6. (belongs-to-category ?x - book ?cat - category) ;; ?x belongs to the category ?cat
    7. (shelf-empty ?cat - category) ;; The shelf for category ?cat is empty
    8. (shelf-overflow ?cat - category) ;; The shelf for category ?cat is full
    9. (book-request ?book - book) ;; There is a request for book ?book
    10. (return-due ?book - book) ;; Book ?book is due for return
    11. (checked-out ?book - book) ;; Book ?book is checked out

    Action Description: Put a book you're holding on top of another accessible book on the shelf.

    Action name: place-on-shelf


    Your answer:
    positive:
    Parameters:
    1. ?x - book
    2. ?y - book
    3. ?cat - category

    Preconditions:
    (and (holding ?x) (accessible ?y) (belongs-to-category ?x ?cat) (not (shelf-overflow ?cat)))

    Effects:
    (and (not (holding ?x)) (not (accessible ?y)) (accessible ?x) (hands-free) (on-shelf ?x ?y) (shelf-empty ?cat))

    negative:
    Parameters:
    1. ?x - book
    2. ?y - book
    3. ?cat - category

    Preconditions:
    (and (holding ?x) (accessible ?y) (belongs-to-category ?x ?cat) (not (shelf-overflow ?cat)))

    Effects:
    (and (not (hands-free)) (not (holding ?x)) (not (accessible ?y)) (accessible ?x) (on-shelf ?x ?y) (shelf-empty ?cat))

    Context: Translate the given natural language description into an action schema that includes the parameters, preconditions, and effects. Ensure that only the provided predicates are used to construct the preconditions and effects.
    Question: Here is the task.
    A natural language description of the domain
    Domain information: This domain is structured to allow organizing and managing books within a library setting. The actions and predicates support the movement of books between tables and shelves, ensuring that conditions like accessibility and the librarian's hands being free are met. Additionally, it includes managing book categories, shelf space, and check-out/return processes to reflect a more complex library system.

    A list of available predicates
    1. (on-shelf ?x ?y - book) ;; ?x is on top of ?y on the shelf
    2. (on-table ?x - book) ;; ?x is on the table
    3. (accessible ?x - book) ;; ?x is accessible (not covered)
    4. (hands-free) ;; The hands of the librarian are free
    5. (holding ?x - book) ;; The librarian is holding ?x
    6. (belongs-to-category ?x - book ?cat - category) ;; ?x belongs to the category ?cat
    7. (shelf-empty ?cat - category) ;; The shelf for category ?cat is empty
    8. (shelf-overflow ?cat - category) ;; The shelf for category ?cat is full
    9. (book-request ?book - book) ;; There is a request for book ?book
    10. (return-due ?book - book) ;; Book ?book is due for return
    11. (checked-out ?book - book) ;; Book ?book is checked out

    Action Description: Imagine a patron returning a borrowed book to the library. The 'return-book' action enables the librarian to process the return, updating the book's status and removing any return due date. This action is applicable when the librarian is holding the book that needs to be returned.

    Action name: return-book


    Your answer:
    positive:
    Parameters:
    1. ?x - book

    Preconditions:
    (and (checked-out ?x) (holding ?x))

    Effects:
    (and (not (checked-out ?x)) (not (holding ?x)) (not (book-request ?x)) (not (return-due ?x)) (accessible ?x) (hands-free))

    negative:
    Parameters:
    1. ?x - book

    Preconditions:
    (and (not (checked-out ?x)) (holding ?x))

    Effects:
    (and (return-due ?x) (checked-out ?x) (not (holding ?x)) (not (book-request ?x)) (accessible ?x) (hands-free))
  • Loss: GISTEmbedLoss with these parameters:
    {'guide': SentenceTransformer(
      (0): Transformer({'max_seq_length': 256, 'do_lower_case': False}) with Transformer model: T5EncoderModel 
      (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
      (2): Dense({'in_features': 1024, 'out_features': 768, 'bias': False, 'activation_function': 'torch.nn.modules.linear.Identity'})
      (3): Normalize()
    ), 'temperature': 0.01}
    

Training Hyperparameters

Non-Default Hyperparameters

  • eval_strategy: steps
  • per_device_train_batch_size: 256
  • num_train_epochs: 40
  • max_steps: 31251
  • warmup_ratio: 0.1
  • fp16: True
  • batch_sampler: no_duplicates
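
These settings interact: in the Hugging Face Trainer, a positive max_steps takes precedence over num_train_epochs, and warmup_ratio is converted into a number of warmup steps. The arithmetic below is a rough worked example of how the values relate to the dataset size. It assumes a single training device (`num_devices = 1` is not stated in this card) and ignores any samples the no_duplicates sampler might skip.

```python
import math

train_samples = 200_010   # size of the training dataset reported in this card
batch_size = 256          # per_device_train_batch_size
max_steps = 31_251
warmup_ratio = 0.1
num_devices = 1           # assumption: single-device run

# Batches needed to see every training sample once (the last batch may be partial).
steps_per_epoch = math.ceil(train_samples / (batch_size * num_devices))

# How many passes over the data max_steps corresponds to.
epochs_covered = max_steps / steps_per_epoch

# Warmup length implied by warmup_ratio, rounded up.
warmup_steps = math.ceil(max_steps * warmup_ratio)

print(steps_per_epoch)           # 782
print(round(epochs_covered, 2))  # 39.96
print(warmup_steps)              # 3126
```

Under these assumptions the step budget corresponds to roughly 40 passes over the data, in line with num_train_epochs, and 850 / 782 ≈ 1.087 agrees with the epoch value logged at step 850.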

All Hyperparameters

  • overwrite_output_dir: False
  • do_predict: False
  • eval_strategy: steps
  • prediction_loss_only: True
  • per_device_train_batch_size: 256
  • per_device_eval_batch_size: 8
  • per_gpu_train_batch_size: None
  • per_gpu_eval_batch_size: None
  • gradient_accumulation_steps: 1
  • eval_accumulation_steps: None
  • learning_rate: 5e-05
  • weight_decay: 0.0
  • adam_beta1: 0.9
  • adam_beta2: 0.999
  • adam_epsilon: 1e-08
  • max_grad_norm: 1.0
  • num_train_epochs: 40
  • max_steps: 31251
  • lr_scheduler_type: linear
  • lr_scheduler_kwargs: {}
  • warmup_ratio: 0.1
  • warmup_steps: 0
  • log_level: passive
  • log_level_replica: warning
  • log_on_each_node: True
  • logging_nan_inf_filter: True
  • save_safetensors: True
  • save_on_each_node: False
  • save_only_model: False
  • restore_callback_states_from_checkpoint: False
  • no_cuda: False
  • use_cpu: False
  • use_mps_device: False
  • seed: 42
  • data_seed: None
  • jit_mode_eval: False
  • use_ipex: False
  • bf16: False
  • fp16: True
  • fp16_opt_level: O1
  • half_precision_backend: auto
  • bf16_full_eval: False
  • fp16_full_eval: False
  • tf32: None
  • local_rank: 0
  • ddp_backend: None
  • tpu_num_cores: None
  • tpu_metrics_debug: False
  • debug: []
  • dataloader_drop_last: False
  • dataloader_num_workers: 0
  • dataloader_prefetch_factor: None
  • past_index: -1
  • disable_tqdm: False
  • remove_unused_columns: True
  • label_names: None
  • load_best_model_at_end: False
  • ignore_data_skip: False
  • fsdp: []
  • fsdp_min_num_params: 0
  • fsdp_config: {'min_num_params': 0, 'xla': False, 'xla_fsdp_v2': False, 'xla_fsdp_grad_ckpt': False}
  • fsdp_transformer_layer_cls_to_wrap: None
  • accelerator_config: {'split_batches': False, 'dispatch_batches': None, 'even_batches': True, 'use_seedable_sampler': True, 'non_blocking': False, 'gradient_accumulation_kwargs': None}
  • deepspeed: None
  • label_smoothing_factor: 0.0
  • optim: adamw_torch
  • optim_args: None
  • adafactor: False
  • group_by_length: False
  • length_column_name: length
  • ddp_find_unused_parameters: None
  • ddp_bucket_cap_mb: None
  • ddp_broadcast_buffers: False
  • dataloader_pin_memory: True
  • dataloader_persistent_workers: False
  • skip_memory_metrics: True
  • use_legacy_prediction_loop: False
  • push_to_hub: False
  • resume_from_checkpoint: None
  • hub_model_id: None
  • hub_strategy: every_save
  • hub_private_repo: False
  • hub_always_push: False
  • gradient_checkpointing: False
  • gradient_checkpointing_kwargs: None
  • include_inputs_for_metrics: False
  • eval_do_concat_batches: True
  • fp16_backend: auto
  • push_to_hub_model_id: None
  • push_to_hub_organization: None
  • mp_parameters:
  • auto_find_batch_size: False
  • full_determinism: False
  • torchdynamo: None
  • ray_scope: last
  • ddp_timeout: 1800
  • torch_compile: False
  • torch_compile_backend: None
  • torch_compile_mode: None
  • dispatch_batches: None
  • split_batches: None
  • include_tokens_per_second: False
  • include_num_input_tokens_seen: False
  • neftune_noise_alpha: None
  • optim_target_modules: None
  • batch_eval_metrics: False
  • eval_on_start: False
  • batch_sampler: no_duplicates
  • multi_dataset_batch_sampler: proportional

Training Logs

Epoch Step Training Loss Validation Loss
0.0013 1 13.401 -
0.0128 10 12.9494 -
0.0256 20 11.0057 -
0.0384 30 6.5825 -
0.0512 40 3.0534 -
0.0639 50 2.38 3.4251
0.0767 60 1.7338 -
0.0895 70 1.4789 -
0.1023 80 1.3218 -
0.1151 90 1.042 -
0.1279 100 1.06 0.8531
0.1407 110 0.9958 -
0.1535 120 0.9441 -
0.1662 130 0.7988 -
0.1790 140 0.7177 -
0.1918 150 0.8212 0.4972
0.2046 160 0.6318 -
0.2174 170 0.507 -
0.2302 180 0.6205 -
0.2430 190 0.5646 -
0.2558 200 0.5621 0.2187
0.2685 210 0.5915 -
0.2813 220 0.4457 -
0.2941 230 0.481 -
0.3069 240 0.4547 -
0.3197 250 0.443 0.1757
0.3325 260 0.3636 -
0.3453 270 0.4799 -
0.3581 280 0.3901 -
0.3708 290 0.3003 -
0.3836 300 0.25 0.1038
0.3964 310 0.2678 -
0.4092 320 0.3444 -
0.4220 330 0.2106 -
0.4348 340 0.2709 -
0.4476 350 0.2826 0.3395
0.4604 360 0.2434 -
0.4731 370 0.2208 -
0.4859 380 0.2434 -
0.4987 390 0.2766 -
0.5115 400 0.2067 0.0890
0.5243 410 0.2169 -
0.5371 420 0.2233 -
0.5499 430 0.1617 -
0.5627 440 0.1772 -
0.5754 450 0.1642 0.1299
0.5882 460 0.1493 -
0.6010 470 0.1507 -
0.6138 480 0.145 -
0.6266 490 0.1137 -
0.6394 500 0.1797 0.1373
0.6522 510 0.142 -
0.6650 520 0.1299 -
0.6777 530 0.0861 -
0.6905 540 0.1347 -
0.7033 550 0.0868 0.3025
0.7161 560 0.2161 -
0.7289 570 0.1281 -
0.7417 580 0.1241 -
0.7545 590 0.0554 -
0.7673 600 0.1829 0.7387
0.7801 610 0.1516 -
0.7928 620 0.094 -
0.8056 630 0.0902 -
0.8184 640 0.1677 -
0.8312 650 0.0541 0.3037
0.8440 660 0.1283 -
0.8568 670 0.1334 -
0.8696 680 0.1791 -
0.8824 690 0.1431 -
0.8951 700 0.0935 0.4147
0.9079 710 0.04 -
0.9207 720 0.1699 -
0.9335 730 0.1293 -
0.9463 740 0.1027 -
0.9591 750 0.1299 0.0023
0.9719 760 0.088 -
0.9847 770 0.0886 -
0.9974 780 0.0636 -
1.0102 790 0.1167 -
1.0230 800 0.0653 0.1323
1.0358 810 0.1378 -
1.0486 820 0.0778 -
1.0614 830 0.1212 -
1.0742 840 0.0472 -
1.0870 850 0.0861 0.0161
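
When choosing a checkpoint from a log like this, it can help to pull out the evaluation points and look at them separately from the training loss. The snippet below is a small sketch that copies the rows from the table above that carry an evaluation loss (the last column, logged every 50 steps) and finds the step with the lowest value. The variable names are hypothetical. Since the evaluation set has only 34 samples, the large swings between neighboring evaluation points may well be noise, so a single minimum should be read with caution.

```python
# (step, evaluation loss) pairs copied from the training log above.
eval_points = [
    (50, 3.4251), (100, 0.8531), (150, 0.4972), (200, 0.2187), (250, 0.1757),
    (300, 0.1038), (350, 0.3395), (400, 0.0890), (450, 0.1299), (500, 0.1373),
    (550, 0.3025), (600, 0.7387), (650, 0.3037), (700, 0.4147), (750, 0.0023),
    (800, 0.1323), (850, 0.0161),
]

# Step with the lowest evaluation loss.
best_step, best_loss = min(eval_points, key=lambda point: point[1])

# Largest jump between consecutive evaluation points, as a crude noise indicator.
largest_jump = max(
    abs(later[1] - earlier[1])
    for earlier, later in zip(eval_points[5:], eval_points[6:])
)

print(best_step, best_loss)
print(round(largest_jump, 4))
```

The jump is computed from step 300 onward so that the initial steep descent does not dominate it.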

Framework Versions

  • Python: 3.10.14
  • Sentence Transformers: 3.1.0.dev0
  • Transformers: 4.42.3
  • PyTorch: 2.3.1+cu121
  • Accelerate: 0.32.1
  • Datasets: 2.20.0
  • Tokenizers: 0.19.1

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

GISTEmbedLoss

@misc{solatorio2024gistembed,
    title={GISTEmbed: Guided In-sample Selection of Training Negatives for Text Embedding Fine-tuning},
    author={Aivin V. Solatorio},
    year={2024},
    eprint={2402.16829},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}