Issue Loading Tokenizer with Falcon 7B

#9
by MRamiBen - opened

Hello everyone,

I'm trying to load a tokenizer using the following command:

tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")

However, I'm encountering this error:


---------------------------------------------------------------------------  
Exception                                 Traceback (most recent call last)  
Cell In[23], line 1  
----> 1 tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")   

File ~/GenAI/augmented_claims_manager/.venv/lib/python3.9/site-packages/transformers/models/auto/tokenization_auto.py:862, in AutoTokenizer.from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs)  
    858     if tokenizer_class is None:  
    859         raise ValueError(  
    860             f"Tokenizer class {tokenizer_class_candidate} does not exist or is not currently imported."  
    861         )  
--> 862     return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)  

... (truncated for brevity) ...

Exception: data did not match any variant of untagged enum ModelWrapper at line 664575 column 3

Has anyone encountered this issue before? Any guidance or suggestions on how to resolve it would be greatly appreciated. Thank you! 🙏
Technology Innovation Institute org

Hi @MRamiBen
Can you try updating the tokenizers library and then running your code again? pip install -U tokenizers

Thank you for your response!
My tokenizers library was on version 0.19. After updating it to version 0.21, the issue was resolved.
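
For anyone hitting the same "untagged enum ModelWrapper" error, here is a minimal sketch of a version check to run before loading the tokenizer. It relies only on the standard library. The threshold of 0.21 is an assumption taken from the version reported as working in this thread; the real minimum may be lower. The helper names (`parse_major_minor`, `installed_tokenizers_version`, `needs_upgrade`) are hypothetical and not part of any library.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# Assumption: 0.21 is the version reported as working in this thread.
# The actual minimum required version may be lower.
KNOWN_GOOD: Tuple[int, int] = (0, 21)


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '0.19.1'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def installed_tokenizers_version() -> Optional[str]:
    """Return the installed tokenizers version, or None if it is missing."""
    try:
        return metadata.version("tokenizers")
    except metadata.PackageNotFoundError:
        return None


def needs_upgrade(version: Optional[str],
                  known_good: Tuple[int, int] = KNOWN_GOOD) -> bool:
    """True if tokenizers is missing or older than the known-good version."""
    if version is None:
        return True
    return parse_major_minor(version) < known_good


if __name__ == "__main__":
    current = installed_tokenizers_version()
    if needs_upgrade(current):
        print(f"tokenizers is {current}; try: pip install -U tokenizers")
    else:
        print(f"tokenizers {current} should be recent enough")
```

If the check suggests an upgrade, restart the notebook kernel after running pip so that the new version is the one actually imported.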
