Please fix

#1
by Robertp423 - opened

The current implementation is broken. Your usage instructions need to include trust_remote_code=True.

Also, what are you using for position embeddings? I can't get anything out of this model except incoherent loops.

Beijing Academy of Artificial Intelligence org

Thanks for your feedback.
trust_remote_code=True has been added to the usage instructions.
Position embedding: https://huggingface.co/BAAI/OpenSeek-Small-v1/blob/main/modeling_deepseek.py#L701-L704

Thanks for clarifying. For transparency, could you confirm whether the model uses both rotary and absolute position embeddings? The docs and code seem to diverge, which might confuse others trying to reproduce your results.
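
In case it helps, this is the sketch I used to inspect what the checkpoint's config declares; the rope_* field names are just guesses based on common DeepSeek-style configs, so they may not all be present on this model:

from transformers import AutoConfig

# Load only the config. The field names below are assumptions based on
# common DeepSeek-style configs and may not all exist on this checkpoint.
config = AutoConfig.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)
for field in ("rope_theta", "rope_scaling", "max_position_embeddings"):
    print(field, "=", getattr(config, field, "not set"))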

It's actually completely broken for me now, but I'm not an expert.

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained("BAAI/OpenSeek-Small-v1", trust_remote_code=True)

inputs = tokenizer("The future of AI is", return_tensors="pt")
outputs = model.generate(**inputs, max_length=50)
Traceback (most recent call last):
File "", line 1, in
outputs = model.generate(**inputs, max_length=50)
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\torch\utils_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 2357, in generate
self._validate_model_kwargs(model_kwargs.copy())
~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\rober\AppData\Local\Programs\Python\Python313\Lib\site-packages\transformers\generation\utils.py", line 1599, in _validate_model_kwargs
raise ValueError(
...<2 lines>...
)
ValueError: The following model_kwargs are not used by the model: ['token_type_ids'] (note: typos in the generate arguments will also show up in this list)

Adding "inputs.pop("token_type_ids", None)" fixes it.
