Make compatible with sentence-transformers
@Shitao Can you merge this? It makes the model compatible with last-token pooling in sentence-transformers.
@michaelfeil thank you for making these changes, I believe they will bring benefits for model serving.
One question regarding `max_seq_length`: though not officially mentioned, the bge-en-icl model's context window seems to be 32,768 according to the MTEB leaderboard. Is there a reason you set `max_seq_length` to 4,096 in this change?
@starsy Fair point. 4096 would only be the default max length for sentence-transformers loading.
Loading with 32768 will lead to an OOM for some users, yet 32768 is technically correct.
Beyond that, the huggingface-transformers implementation uses a sliding window of 4096. Past 4096 tokens you need the flash-attn CUDA extension installed to get correct output; otherwise the output is silently incorrect, since torch.sdpa does not support window_size=4096 causal forward attention.
Leaving it at 32768 for now! @Shitao appreciate your review.
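For users hitting OOM at the 32768 default, a minimal sketch of capping the length after loading (the helper name and the `4096` default are illustrative, taken from the sliding-window discussion above; the checkpoint name is the public BAAI model):

```python
def load_bge_en_icl(max_seq_length: int = 4096):
    """Load BAAI/bge-en-icl with an explicit token cap.

    The full 32768-token window needs flash-attn and plenty of GPU memory;
    4096 matches the sliding window and is a safer default on most hardware.
    This helper is a sketch, not part of the model repo.
    """
    # Deferred import so the sketch stays cheap to import without the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer("BAAI/bge-en-icl")
    model.max_seq_length = max_seq_length  # override the loaded default
    return model
```

`max_seq_length` is a plain attribute on `SentenceTransformer`, so overriding it after loading is the standard way to trade context length for memory.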
@Shitao Can you please review?
Thanks for your contribution! @michaelfeil