Please fix
The current implementation is broken. Your usage instructions need to include trust_remote_code=True.
Also, what are you using for position embeddings? I can't get anything out of this model except incoherent loops.
Thanks for your feedback.
trust_remote_code=True has been added to the usage instructions.
Position embedding: https://huggingface.co/BAAI/OpenSeek-Small-v1/blob/main/modeling_deepseek.py#L701-L704
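DeepSeek-style models commonly use rotary position embeddings (RoPE), which is presumably what the linked lines implement; whether absolute embeddings are also involved is exactly the question above. For readers unfamiliar with the mechanism, here is a minimal pure-Python sketch of rotary embedding (apply_rope is an illustrative helper, not a function from the linked file):

```python
import math

def apply_rope(x, pos, base=10000.0):
    # Rotary position embedding: rotate each consecutive channel pair
    # (x[2i], x[2i+1]) by an angle pos * base**(-2i/dim), so relative
    # position falls out of the dot product between rotated vectors.
    dim = len(x)
    out = []
    for i in range(dim // 2):
        theta = pos * base ** (-2 * i / dim)
        a, b = x[2 * i], x[2 * i + 1]
        out += [a * math.cos(theta) - b * math.sin(theta),
                a * math.sin(theta) + b * math.cos(theta)]
    return out

# Sanity check: position 0 applies a zero rotation and leaves the vector unchanged.
print(apply_rope([1.0, 0.0, 1.0, 0.0], pos=0))  # → [1.0, 0.0, 1.0, 0.0]
```

Because only the angle depends on position, no learned absolute embedding table is needed when RoPE alone is used.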
Thanks for clarifying. For transparency, could you confirm whether the model uses both rotary and absolute position embeddings? The docs and code seem to diverge, which might confuse others trying to reproduce your results.
It's actually completely broken for me now, but I'm not an expert.
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)

inputs = tokenizer("The future of AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
    outputs = model.generate(**inputs, max_length=50)
  File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 2357, in generate
    self._validate_model_kwargs(model_kwargs.copy())
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 1599, in _validate_model_kwargs
    raise ValueError(
    ...<2 lines>...
    )
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)
Adding inputs.pop("token_type_ids", None) before the generate call fixes it.
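The workaround works because the tokenizer returns a dict-like object containing a token_type_ids key that this model's generate() does not accept; dropping the key before unpacking avoids the ValueError. A minimal sketch with a stand-in dict (the real inputs would come from the tokenizer call above):

```python
# Stand-in for the tokenizer output; the real object behaves like a dict
# whose keys are passed to model.generate(**inputs).
inputs = {
    "input_ids": [[1, 2, 3]],
    "attention_mask": [[1, 1, 1]],
    "token_type_ids": [[0, 0, 0]],  # the key generate() rejects for this model
}

# Remove the unsupported kwarg; the default None makes this safe even if
# a different tokenizer never produced the key.
inputs.pop("token_type_ids", None)

print(sorted(inputs))  # → ['attention_mask', 'input_ids']
```

An alternative is to pass return_token_type_ids=False to the tokenizer call so the key is never produced.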