[CLS] token representation or Pooled tokens?
#8 · by aarabil · opened
How is the base model used during fine-tuning: do you use the [CLS] token's hidden representation, or do you pool the token representations somehow (e.g., by averaging)?
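For clarity, here is a minimal sketch of the two options I mean, using the `transformers` `AutoModel` API. The checkpoint name is only a placeholder for illustration, not the actual base model of this repo:

```python
import torch
from transformers import AutoModel, AutoTokenizer

checkpoint = "microsoft/deberta-v3-base"  # placeholder; any encoder-only checkpoint
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModel.from_pretrained(checkpoint)

inputs = tokenizer("A man is playing a guitar.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # (batch, seq_len, hidden_size)

# Option 1: use the hidden state of the first ([CLS]) token
cls_repr = hidden[:, 0, :]

# Option 2: mean-pool all token representations, ignoring padding
mask = inputs["attention_mask"].unsqueeze(-1).float()
mean_repr = (hidden * mask).sum(dim=1) / mask.sum(dim=1)
```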
Coming back to this question, especially since the base model here is a fine-tuned embedding model. I'm wondering how you adapt this model for the NLI task compared to a standard encoder. Do you simply use the embedding model in the same way as the other models? If so, which separator token do you use?
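As an aside on the separator question: for standard encoders the pair separator is inserted by the tokenizer when you pass the premise and hypothesis as a sentence pair, so no manual choice is needed. A quick check, with `roberta-base` used purely as an illustrative checkpoint:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")  # illustrative checkpoint

premise = "A man is playing a guitar."
hypothesis = "A person is making music."

encoded = tokenizer(premise, hypothesis)
print(tokenizer.sep_token)                     # "</s>" for RoBERTa, "[SEP]" for BERT/DeBERTa
print(tokenizer.decode(encoded["input_ids"]))  # shows how the pair is joined
```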
I used the exact same script as for other encoder-only models like RoBERTa or DeBERTa via the HF Trainer, so I assume that the [CLS] token was used.
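If I remember the `transformers` source correctly, the stock sequence-classification heads for these encoders do operate on the first token: RoBERTa's classification head and DeBERTa's `ContextPooler` both index `hidden_states[:, 0]`. A rough way to inspect this, again with an illustrative checkpoint and an assumed 3-label NLI setup:

```python
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "microsoft/deberta-v3-base",  # illustrative checkpoint
    num_labels=3,                 # e.g. entailment / neutral / contradiction
)
print(model.pooler)      # DeBERTa's ContextPooler, which takes hidden_states[:, 0]
print(model.classifier)  # linear layer applied on top of that pooled representation
```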