Issue running a model using SentenceTransformer
I am trying to run the following sample code using the Sentence Transformers library.
from sentence_transformers import SentenceTransformer
model = SentenceTransformer(
    "nomic-ai/nomic-embed-text-v2-moe",
    trust_remote_code=True,
)
sentences = ["Hello!", "¡Hola!"]
embeddings = model.encode(sentences, prompt_name="passage")
The call to model.encode fails with TypeError: MoE.forward() takes 2 positional arguments but 3 were given
I would really appreciate it if anyone could give me guidance on how to resolve this issue.
Full traceback:
python temp.py
Traceback (most recent call last):
  File "/home/chingiztuleubayev/fine-tune/src/temp.py", line 10, in <module>
    embeddings = model.encode(sentences, prompt_name="passage")
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 623, in encode
    out_features = self.forward(features, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/SentenceTransformer.py", line 690, in forward
    input = module(input, **module_kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/sentence_transformers/models/Transformer.py", line 393, in forward
    output_states = self.auto_model(**trans_features, **kwargs, return_dict=False)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chingiztuleubayev/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/359596ab182dcf943b7ca9e3f8809b6c2eaf652f/modeling_hf_nomic_bert.py", line 1910, in forward
    sequence_output = self.encoder(hidden_states, attention_mask=attention_mask, return_dict=return_dict)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chingiztuleubayev/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/359596ab182dcf943b7ca9e3f8809b6c2eaf652f/modeling_hf_nomic_bert.py", line 1789, in forward
    hidden_states, hidden_states2, residual = layer(
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/chingiztuleubayev/.cache/huggingface/modules/transformers_modules/nomic-ai/nomic-bert-2048/359596ab182dcf943b7ca9e3f8809b6c2eaf652f/modeling_hf_nomic_bert.py", line 1718, in forward
    mlp_out = self.mlp(hidden_states, torch.where(attention_mask.squeeze() == 0, 1, 0))
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1736, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
    return forward_call(*args, **kwargs)
TypeError: MoE.forward() takes 2 positional arguments but 3 were given
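For readers hitting the same message: the last model frame calls self.mlp(hidden_states, <mask>), which is two arguments plus self, while the MoE.forward that was actually loaded accepts only one argument plus self. The following is a minimal sketch of that kind of mismatch in plain Python. The MoE class here is a hypothetical stand-in written for illustration, not the megablocks or Nomic implementation.

```python
class MoE:
    """Hypothetical stand-in for a layer whose forward() accepts one input."""

    def forward(self, x):
        return x


layer = MoE()

try:
    # Mirrors the failing call shape: an extra positional argument (the mask).
    layer.forward([1.0, 2.0], [0, 1])
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

Python counts self as a positional argument, which is why the message reports 2 accepted and 3 given even though the call site passes two values.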
Apologies for this. Can you run
pip list
so I can debug this, please?
Thanks for the quick response. Here is the output:
pip list
Package Version
---------------------------------------- --------------
absl-py 2.1.0
accelerate 1.2.0
aiofiles 22.1.0
aiohttp 3.9.5
aiohttp-cors 0.7.0
aiosignal 1.3.1
aiosqlite 0.20.0
alembic 1.14.1
annotated-types 0.7.0
ansicolors 1.1.8
anyio 4.6.2.post1
appdirs 1.4.4
archspec 0.2.3
argon2-cffi 23.1.0
argon2-cffi-bindings 21.2.0
arrow 1.3.0
asttokens 2.4.1
async-timeout 4.0.3
atpublic 4.1.0
attrs 24.2.0
babel 2.16.0
backports.tarfile 1.2.0
beatrix_jupyterlab 2024.920.84202
beautifulsoup4 4.12.3
bidict 0.23.1
bigframes 0.22.0
bleach 6.2.0
blessed 1.20.0
blinker 1.9.0
bokeh 3.6.2
boltons 24.0.0
Brotli 1.1.0
build 1.2.2.post1
cachetools 5.5.0
certifi 2024.12.14
cffi 1.17.1
charset-normalizer 3.4.0
click 8.1.7
cloud-tpu-client 0.10
cloudpickle 3.1.0
cmake 3.31.2
colorama 0.4.6
coloredlogs 15.0.1
colorful 0.5.6
colorlog 6.9.0
comm 0.2.2
conda 24.9.2
conda-libmamba-solver 24.9.0
conda-package-handling 2.4.0
conda_package_streaming 0.11.0
ConfigArgParse 1.7
contourpy 1.3.1
cryptography 43.0.3
cupy-cuda12x 13.3.0
cycler 0.12.1
Cython 3.0.11
dacite 1.8.1
dataproc_jupyter_plugin 0.1.80
datasets 2.21.0
db-dtypes 1.3.1
debugpy 1.8.8
decorator 5.1.1
defusedxml 0.7.1
Deprecated 1.2.15
dill 0.3.8
distlib 0.3.9
distro 1.9.0
dm-tree 0.1.8
docker 7.1.0
docstring_parser 0.16
einops 0.8.0
entrypoints 0.4
eval_type_backport 0.2.2
exceptiongroup 1.2.2
executing 2.1.0
Farama-Notifications 0.0.4
fastapi 0.115.5
fastembed 0.4.2
fastjsonschema 2.20.0
fastrlock 0.8.2
filelock 3.16.1
flash-attn 2.7.2.post1
Flask 3.1.0
Flask-Cors 5.0.0
Flask-Login 0.6.3
flatbuffers 24.3.25
fonttools 4.55.0
fqdn 1.5.1
frozendict 2.4.6
frozenlist 1.5.0
fsspec 2024.6.1
gcsfs 2024.10.0
gdown 5.2.0
geopandas 1.0.1
gevent 24.11.1
geventhttpclient 2.3.3
gitdb 4.0.11
GitPython 3.1.43
google-api-core 1.34.1
google-api-python-client 1.8.0
google-auth 2.36.0
google-auth-httplib2 0.2.0
google-auth-oauthlib 1.2.1
google-cloud-aiplatform 1.72.0
google-cloud-artifact-registry 1.13.1
google-cloud-bigquery 3.25.0
google-cloud-bigquery-connection 1.16.1
google-cloud-bigquery-storage 2.27.0
google-cloud-core 2.4.1
google-cloud-datastore 1.15.5
google-cloud-functions 1.18.1
google-cloud-iam 2.16.1
google-cloud-jupyter-config 0.0.10
google-cloud-language 2.15.1
google-cloud-monitoring 2.23.1
google-cloud-resource-manager 1.13.1
google-cloud-storage 2.14.0
google-crc32c 1.6.0
google-resumable-media 2.7.2
googleapis-common-protos 1.66.0
gpustat 1.0.0
greenlet 3.1.1
grouped_gemm 0.1.6
grpc-google-iam-v1 0.13.1
grpcio 1.68.1
grpcio-status 1.48.2
grpcio-tools 1.68.1
gymnasium 1.0.0
h11 0.14.0
h2 4.1.0
hdbscan 0.8.40
hpack 4.0.0
htmlmin 0.1.12
httpcore 1.0.7
httplib2 0.22.0
httptools 0.6.4
httpx 0.28.1
huggingface-hub 0.27.0
humanfriendly 10.0
humanize 4.11.0
hyperframe 6.0.1
ibis-framework 7.1.0
idna 3.10
ImageHash 4.3.1
imageio 2.36.0
importlib_metadata 8.4.0
importlib_resources 6.4.5
ipykernel 6.29.5
ipython 8.21.0
ipython-genutils 0.2.0
ipython-sql 0.5.0
ipywidgets 8.1.5
isoduration 20.11.0
itsdangerous 2.2.0
jaraco.classes 3.4.0
jaraco.context 6.0.1
jaraco.functools 4.1.0
jedi 0.19.2
jeepney 0.8.0
Jinja2 3.1.4
joblib 1.4.2
json5 0.9.28
jsonpatch 1.33
jsonpointer 3.0.0
jsonschema 4.23.0
jsonschema-specifications 2024.10.1
jupyter_client 7.4.9
jupyter_core 5.7.2
jupyter-events 0.10.0
jupyter-http-over-ws 0.0.8
jupyter_server 2.14.2
jupyter_server_fileid 0.9.3
jupyter-server-mathjax 0.2.6
jupyter_server_proxy 4.4.0
jupyter_server_terminals 0.5.3
jupyter_server_ydoc 0.8.0
jupyter-ydoc 0.2.5
jupyterlab 3.6.8
jupyterlab_git 0.44.0
jupyterlab_pygments 0.3.0
jupyterlab_server 2.27.3
jupyterlab_widgets 3.0.13
jupytext 1.16.4
kernels-mixer 0.0.15
keyring 25.5.0
keyrings.google-artifactregistry-auth 1.1.2
kfp 2.5.0
kfp-pipeline-spec 0.2.2
kfp-server-api 2.0.5
kiwisolver 1.4.7
kubernetes 26.1.0
lazy_loader 0.4
libmambapy 1.5.10
lightning 2.5.0.post0
lightning-utilities 0.11.9
linkify-it-py 2.0.3
llvmlite 0.41.1
locust 2.32.9
loguru 0.7.3
lz4 4.3.3
Mako 1.3.8
mamba 1.5.10
Markdown 3.7
markdown-it-py 3.0.0
MarkupSafe 3.0.2
matplotlib 3.7.3
matplotlib-inline 0.1.7
mdit-py-plugins 0.4.2
mdurl 0.1.2
megablocks 0.7.0
memray 1.14.0
menuinst 2.2.0
mistune 3.0.2
mmh3 4.1.0
more-itertools 10.5.0
mpmath 1.3.0
msgpack 1.1.0
mteb 1.31.1
multidict 6.1.0
multimethod 1.12
multipledispatch 1.0.0
multiprocess 0.70.16
nbclassic 1.1.0
nbclient 0.10.0
nbconvert 7.16.4
nbdime 3.2.0
nbformat 5.10.4
nest-asyncio 1.6.0
networkx 3.4.2
ninja 1.11.1.3
notebook 6.5.7
notebook_executor 0.2
notebook_shim 0.2.4
numba 0.58.1
numpy 1.25.0
nvidia-cublas-cu12 12.4.5.8
nvidia-cuda-cupti-cu12 12.4.127
nvidia-cuda-nvrtc-cu12 12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12 9.1.0.70
nvidia-cufft-cu12 11.2.1.3
nvidia-curand-cu12 10.3.5.147
nvidia-cusolver-cu12 11.6.1.9
nvidia-cusparse-cu12 12.3.1.170
nvidia-ml-py 11.495.46
nvidia-nccl-cu12 2.21.5
nvidia-nvjitlink-cu12 12.4.127
nvidia-nvtx-cu12 12.4.127
oauth2client 4.1.3
oauthlib 3.2.2
onnx 1.17.0
onnxruntime 1.19.2
opencensus 0.11.4
opencensus-context 0.1.3
opentelemetry-api 1.27.0
opentelemetry-exporter-otlp 1.27.0
opentelemetry-exporter-otlp-proto-common 1.27.0
opentelemetry-exporter-otlp-proto-grpc 1.27.0
opentelemetry-exporter-otlp-proto-http 1.27.0
opentelemetry-proto 1.27.0
opentelemetry-sdk 1.27.0
opentelemetry-semantic-conventions 0.48b0
optimum 1.23.3
optuna 4.2.0
overrides 7.7.0
packaging 24.1
pandas 2.0.3
pandas-profiling 3.6.6
pandocfilters 1.5.1
papermill 2.6.0
parso 0.8.4
parsy 2.1
patsy 1.0.1
pendulum 3.0.0
pexpect 4.9.0
phik 0.12.4
pillow 10.4.0
pins 0.8.6
pip 24.3.1
platformdirs 4.3.6
plotly 5.24.1
pluggy 1.5.0
polars 1.21.0
portalocker 2.10.1
prettytable 3.12.0
prometheus_client 0.21.0
prompt_toolkit 3.0.48
propcache 0.2.0
proto-plus 1.25.0
protobuf 5.29.2
psutil 5.9.3
ptyprocess 0.7.0
pure_eval 0.2.3
py_rust_stemmers 0.1.3
py-spy 0.4.0
pyarrow 18.1.0
pyarrow-hotfix 0.6
pyasn1 0.6.1
pyasn1_modules 0.4.1
pycosat 0.6.6
pycparser 2.22
pydantic 2.10.4
pydantic_core 2.27.2
pydata-google-auth 1.8.2
Pygments 2.18.0
PyJWT 2.10.0
pynndescent 0.5.13
pyogrio 0.10.0
pyOpenSSL 24.2.1
pyparsing 3.2.0
pyproj 3.7.0
pyproject_hooks 1.2.0
PySocks 1.7.1
PyStemmer 2.2.0.3
python-dateutil 2.9.0.post0
python-dotenv 1.0.1
python-json-logger 2.0.7
pytorch-lightning 2.5.0.post0
pytrec-eval-terrier 0.5.6
pytz 2024.2
PyWavelets 1.7.0
PyYAML 6.0.2
pyzmq 26.2.0
qdrant-client 1.12.1
ray 2.39.0
referencing 0.35.1
regex 2024.11.6
requests 2.32.3
requests-oauthlib 2.0.0
requests-toolbelt 0.10.1
retrying 1.3.4
rfc3339-validator 0.1.4
rfc3986 1.5.0
rfc3986-validator 0.1.1
rich 13.9.4
rpds-py 0.21.0
rsa 4.9
ruamel.yaml 0.18.6
ruamel.yaml.clib 0.2.8
safetensors 0.4.5
scikit-image 0.24.0
scikit-learn 1.5.2
scipy 1.11.4
seaborn 0.12.2
SecretStorage 3.3.3
Send2Trash 1.8.3
sentence-transformers 3.3.1
setuptools 75.3.0
shapely 2.0.6
shellingham 1.5.4
simpervisor 1.0.0
six 1.16.0
smart-open 7.0.5
smmap 5.0.1
sniffio 1.3.1
snowballstemmer 2.2.0
soupsieve 2.6
SQLAlchemy 2.0.36
sqlglot 19.9.0
sqlparse 0.5.2
stack-data 0.6.3
stanford-stk 0.7.1
starlette 0.41.3
statsmodels 0.14.4
sympy 1.13.1
tabulate 0.9.0
tangled-up-in-unicode 0.2.0
tenacity 9.0.0
tensorboard 2.18.0
tensorboard-data-server 0.7.2
tensorboardX 2.6.2.2
terminado 0.18.1
textual 0.86.2
threadpoolctl 3.5.0
tifffile 2024.9.20
time-machine 2.16.0
tinycss2 1.4.0
tokenizers 0.21.0
tomli 2.1.0
toolz 0.12.1
torch 2.5.1+cu124
torchaudio 2.5.1+cu124
torchmetrics 1.6.1
torchvision 0.20.1+cu124
tornado 6.4.1
tqdm 4.67.1
traitlets 5.14.3
transformers 4.47.1
triton 3.1.0
truststore 0.10.0
typeguard 4.4.1
typer 0.13.1
types-python-dateutil 2.9.0.20241003
typing_extensions 4.12.2
tzdata 2024.2
uc-micro-py 1.0.3
umap 0.1.1
umap-learn 0.5.7
uri-template 1.3.0
uritemplate 3.0.1
urllib3 2.3.0
uvicorn 0.32.0
uvloop 0.21.0
virtualenv 20.27.1
visions 0.7.5
watchfiles 0.24.0
wcwidth 0.2.13
webcolors 24.11.1
webencodings 0.5.1
websocket-client 1.8.0
websockets 14.1
Werkzeug 3.1.3
wheel 0.45.0
widgetsnbextension 4.0.13
wordcloud 1.9.4
wrapt 1.16.0
xgboost 2.1.3
xxhash 3.5.0
xyzservices 2025.1.0
y-py 0.6.2
yarl 1.17.2
ydata-profiling 4.6.0
ypy-websocket 0.8.4
zipp 3.21.0
zope.event 5.0
zope.interface 7.2
zstandard 0.23.0
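Since the full list is long, a shorter report may be easier to share when debugging this kind of error. The sketch below prints the versions of a handful of packages using the standard-library importlib.metadata module. The choice of packages in RELEVANT is an assumption about what matters for this traceback, not something the maintainers specified.

```python
from importlib import metadata

# Assumption: these are the packages most likely to matter for this traceback.
RELEVANT = [
    "sentence-transformers",
    "transformers",
    "torch",
    "megablocks",
    "grouped_gemm",
    "flash-attn",
]


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions(RELEVANT).items():
    print(f"{name}: {version or 'not installed'}")
```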
Ah, we had to make some modifications to megablocks for training. Can you try installing megablocks with
pip install git+https://github.com/nomic-ai/megablocks.git
You can also run the model without megablocks if you'd like, but it will be considerably slower.
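One way to check which variant of megablocks is installed is to inspect how many positional arguments its MoE.forward accepts. This is a rough sketch under stated assumptions: the import path megablocks.layers.moe.MoE is assumed rather than confirmed in this thread, and positional_arg_count and check_megablocks are hypothetical helper names.

```python
import inspect


def positional_arg_count(func):
    """Count the positional parameters of func, not counting self."""
    params = inspect.signature(func).parameters.values()
    return sum(
        1
        for p in params
        if p.kind in (p.POSITIONAL_ONLY, p.POSITIONAL_OR_KEYWORD)
        and p.name != "self"
    )


def check_megablocks():
    """Return the positional argument count of MoE.forward, or None."""
    try:
        # Assumption: the MoE layer is importable from this module path.
        from megablocks.layers.moe import MoE
    except Exception:
        # Any import failure is treated as "megablocks not usable here".
        return None
    return positional_arg_count(MoE.forward)


print("MoE.forward positional args:", check_megablocks())
```

A result of 1 would be consistent with the error in the traceback, since the model code passes two values. A result of None means megablocks could not be imported at all.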
Thanks for the quick fix. I was able to get it working without megablocks for a quick test.
I reckon the model should be immediately finetune-able with Sentence Transformers (and presumably also with contrastors!)
- Tom Aarsen