Please fix

#1
by Robertp423 - opened

The current implementation is broken. Your usage instructions need to include trust_remote_code=True.

Also, what are you using for position embeddings? I can't get anything out of this model except incoherent loops.

Beijing Academy of Artificial Intelligence org

Thanks for your feedback.
trust_remote_code=True has been added to the usage instructions.
Position embedding: https://huggingface.co/BAAI/OpenSeek-Small-v1/blob/main/modeling_deepseek.py#L701-L704

Thanks for clarifying. For transparency, could you confirm whether the model uses both rotary and absolute position embeddings? The docs and code seem to diverge, which might confuse others trying to reproduce your results.
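
In case it helps, this is the sketch I used to inspect what the checkpoint's config declares; the rope_* field names are just guesses based on common DeepSeek-style configs, so they may not all be present on this model:

from transformers import AutoConfig

# Load only the config. The field names below are assumptions based on
# common DeepSeek-style configs and may not all exist on this checkpoint.
config = AutoConfig.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)
for field in ("rope_theta", "rope_scaling", "max_position_embeddings"):
    print(field, "=", getattr(config, field, "not set"))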

It's actually completely broken for me now, but I'm not an expert.

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)

inputs = tokenizer("The future of AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
Traceback (most recent call last):
File "", line 1, in
outputs = model.generate(**inputs, max_length=50)
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 2357, in generate
self._validate_model_kwargs(model_kwargs.copy())
~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 1599, in _validate_model_kwargs
raise ValueError(
...<2 lines>...
)
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)

Adding "inputs.pop("token_type_ids", None)" fixes it.
