RuntimeError: The size of tensor a (768) must match the size of tensor b (9216) at non-singleton dimension 2
#2 opened by koute
Consider the following code:
```python
import torch
import transformers

model_path = "../models/RWKV7-Goose-Pile-168M-HF"
tokenizer = transformers.AutoTokenizer.from_pretrained(model_path)
model = transformers.AutoModelForCausalLM.from_pretrained(
    model_path, device_map="cuda", trust_remote_code=True
)

# A single forward pass over a trivial prompt is enough to trigger the error.
inputs = tokenizer("Hello world!")
model(input_ids=torch.tensor(inputs["input_ids"], device="cuda").unsqueeze(0))
```
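(The manual tensor construction isn't essential, by the way; assuming the standard `transformers` tokenizer API, the built-in batching should exercise the same code path:)

```python
# Hypothetical equivalent input handling; behaviour should be identical.
inputs = tokenizer("Hello world!", return_tensors="pt").to("cuda")
model(input_ids=inputs["input_ids"])
```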
Either way, running it outputs:
```
Traceback (most recent call last):
  File "./test-rwkv.py", line 11, in <module>
    model(input_ids = torch.tensor(inputs["input_ids"], device = "cuda").unsqueeze(0))
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/transformers/utils/deprecation.py", line 172, in wrapped_func
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/fla/models/rwkv7/modeling_rwkv7.py", line 445, in forward
    outputs = self.model(
              ^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/fla/models/rwkv7/modeling_rwkv7.py", line 314, in forward
    hidden_states, attentions, past_key_values, v_first = layer(
                                                          ^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/fla/models/rwkv7/modeling_rwkv7.py", line 156, in forward
    hidden_states, attentions, past_key_values, v_first = self.attn(
                                                          ^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".venv/lib/python3.11/site-packages/fla/layers/rwkv7.py", line 217, in forward
    o = o + ((r * k * self.r_k).sum(-1, keepdim=True) * v).view(batch_size, seq_len, -1)
        ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
RuntimeError: The size of tensor a (768) must match the size of tensor b (9216) at non-singleton dimension 2
```
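For anyone puzzled by the message itself: this is PyTorch's generic elementwise-broadcasting failure, raised whenever two tensors disagree in a dimension where neither size is 1. A minimal standalone sketch (unrelated to the RWKV7 internals) that raises the same error:

```python
import torch

a = torch.zeros(1, 2, 768)   # shaped like (batch, seq_len, hidden)
b = torch.zeros(1, 2, 9216)  # last dimension differs, and neither is 1

# RuntimeError: The size of tensor a (768) must match the size of
# tensor b (9216) at non-singleton dimension 2
a + b
```

Incidentally, 9216 = 12 × 768, which looks like a num_heads × hidden_size reshape going wrong somewhere, though that's just a guess from the numbers.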
Dependencies:

```
fla @ git+https://github.com/fla-org/flash-linear-attention@7eff5519d42629ee37453765d41057b393068a50#7eff5519d42629ee37453765d41057b393068a50
transformers @ git+https://github.com/huggingface/transformers@0ebd6651acd32c982fee265b23243b89bdb89577
torch==2.6.0
```
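As a side note, a quick way to double-check which versions are actually installed in an environment (a sketch using only the standard library; `fla` is the distribution name pip reports in the freeze output above):

```python
# Print the installed versions of the packages involved in the traceback.
import importlib.metadata

for dist in ("fla", "transformers", "torch"):
    try:
        print(dist, importlib.metadata.version(dist))
    except importlib.metadata.PackageNotFoundError:
        print(dist, "is not installed")
```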
Is this just broken, or do I need some specific versions of the libraries?
Seems like it's fixed now in the newest version of flash-linear-attention.
koute changed discussion status to closed