AssertionError: Both operands must be same dtype. Got fp16 and bf16
#8 by treehugg3 - opened
I get this error when running the demo sample script:
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 102:13:
zeros = tl.interleave(zeros, zeros)
zeros = tl.interleave(zeros, zeros)
zeros = tl.broadcast_to(zeros, (BLOCK_SIZE_K, BLOCK_SIZE_N))
offsets_s = N * offsets_szk[:, None] + offsets_sn[None, :]
masks_sk = offsets_szk < K // group_size
masks_s = masks_sk[:, None] & masks_sn[None, :]
scales_ptrs = scales_ptr + offsets_s
scales = tl.load(scales_ptrs, mask=masks_s)
scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))
b = (b >> shifts) & 0xF
^
IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16')
Ubuntu 22.04, latest git transformers, triton==3.2.0, autoawq==0.2.8.
The error IncompatibleTypeErrorImpl('invalid operands of type triton.language.float16 and triton.language.float16') is solved by ensuring you use torch_dtype="auto". Don't set it to torch.float16 as autoawq recommends.
But now I get this other error:
Traceback (most recent call last):
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/language/core.py", line 35, in wrapper
return fn(*args, **kwargs)
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/core.py", line 1548, in dot
return semantic.dot(input, other, acc, input_precision, max_num_imprecise_acc, out_dtype, _builder)
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/
language/semantic.py", line 1470, in dot
assert lhs.dtype == rhs.dtype, f"Both operands must be same dtype. Got {lhs.dtype} and {rhs.dtype}"
AssertionError: Both operands must be same dtype. Got fp16 and bf16
The above exception was the direct cause of the following exception:
File "/Qwen2.5-VL-32B-Instruct-AWQ/venv/lib/python3.10/site-packages/triton/compiler/compiler.py", line 100, in make_ir
return ast_to_ttir(self.fn, self, context=context, options=options, codegen_fns=codegen_fns,
triton.compiler.errors.CompilationError: at 108:22:
masks_s = masks_sk[:, None] & masks_sn[None, :]
scales_ptrs = scales_ptr + offsets_s
scales = tl.load(scales_ptrs, mask=masks_s)
scales = tl.broadcast_to(scales, (BLOCK_SIZE_K, BLOCK_SIZE_N))
b = (b >> shifts) & 0xF
zeros = (zeros >> shifts) & 0xF
b = (b - zeros) * scales
b = b.to(c_ptr.type.element_ty)
# Accumulate results.
accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype)
^
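The assert fires inside tl.dot, which requires both operands to share an element type; here the activations a and the dequantized weights b end up as fp16 and bf16. A hypothetical kernel-side workaround (purely to illustrate what the assert checks, not the fix I ended up using) would be to cast both operands to the output element type right before the dot:

# Hypothetical edit inside the AWQ GEMM kernel, just before the accumulate:
a = a.to(c_ptr.type.element_ty)  # cast activations to the output dtype
b = b.to(c_ptr.type.element_ty)  # cast dequantized weights to the same dtype
accumulator = tl.dot(a, b, accumulator, out_dtype=accumulator_dtype)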
treehugg3 changed discussion title from "invalid operands of type triton.language.float16 and triton.language.float16" to "AssertionError: Both operands must be same dtype. Got fp16 and bf16"
The temporary fix was to use this version of transformers:
pip install git+https://github.com/huggingface/transformers.git@8ee50537fe7613b87881cd043a85971c85e99519