Something wrong with tokenizer??

#4
by ctranslate2-4you - opened

Hello,

When I try to run this model using Huggingfaceembeddings from Langchain, which wraps sentence transformers, I get this error when trying to conduct a search and no results are returned:

There was a bug in Trie algorithm in tokenization. Attempting to recover. Please report it anyway.

This particular error originates from tokenization_utils.py within the transformers library and I've tried setting trust_remote_code among other troubleshooting steps.

@tomaarsen or the repository owners can help since you guys are the experts?

ctranslate2-4you changed discussion title from Something wrong with tokenizer to Something wrong with tokenizer??

Sign up or log in to comment