RuntimeError: shape '[1, 33, 64, 2, 64, 2]' is invalid for input of size 327680
I'm trying to run inference with depth control only, without inpainting. Is that possible, or are all three inputs (inpaint_image, inpaint_mask, control_image) mandatory?
control_image = load_image("https://ostris.com/wp-content/uploads/2025/04/dog_depth.jpg")
image = pipe(
    prompt="A white friendly robotic dog sitting on a bench",
    control_image=control_image,
    control_strength=0.5,
    control_stop=0.33,
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(42),
).images[0]
image.save("robot_dog_control.png")
(sddw-dev) C:\aiOWN\diffuser_webui>python flex_2_NEW.py
Fetching 26 files: 100%|█████████████████████████████████████████| 26/26 [00:00<00:00, 181.59it/s]
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:32<00:00, 16.01s/it]
Loading checkpoint shards: 100%|████████████████████████████████████| 2/2 [00:52<00:00, 26.03s/it]
Keyword arguments {'trust_remote_code': True} are not expected by Flex2Pipeline and will be ignored.
Loading pipeline components...:  14%|█████     | 1/7 [00:00<00:00, 6.01it/s]You set add_prefix_space. The tokenizer needs to be converted from the slow tokenizers
Loading pipeline components...: 100%|███████████████████████████████| 7/7 [00:01<00:00, 4.37it/s]
100%|█████████████████████████████████████████████████████████████| 50/50 [02:51<00:00, 3.44s/it]
Traceback (most recent call last):
  File "C:\aiOWN\diffuser_webui\flex_2_NEW.py", line 71, in <module>
    image = pipe(
            ^^^^^
  File "C:\Users\nitin\miniconda3\envs\sddw-dev\Lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nitin\.cache\huggingface\modules\diffusers_modules\local\pipeline.py", line 319, in __call__
    packed_latent_controls = self._pack_latents(
                             ^^^^^^^^^^^^^^^^^^^
  File "C:\Users\nitin\miniconda3\envs\sddw-dev\Lib\site-packages\diffusers\pipelines\flux\pipeline_flux_control.py", line 471, in _pack_latents
    latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[1, 33, 64, 2, 64, 2]' is invalid for input of size 327680
Plain text-to-image fails with the same error as well.
image = pipe(
    prompt="A white friendly robotic dog sitting on a bench",
    height=1024,
    width=1024,
    guidance_scale=3.5,
    num_inference_steps=50,
    generator=torch.Generator("cpu").manual_seed(42),
).images[0]
image.save("robot_dog.png")
File "C:\Users\nitin\miniconda3\envs\sddw-dev\Lib\site-packages\diffusers\pipelines\flux\pipeline_flux_control.py", line 471, in _pack_latents
latents = latents.view(batch_size, num_channels_latents, height // 2, 2, width // 2, 2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: shape '[1, 33, 64, 2, 64, 2]' is invalid for input of size 114688
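For what it's worth, the element counts make the mismatch explicit. `_pack_latents` calls `latents.view(batch, channels, h // 2, 2, w // 2, 2)`, and `view` requires the requested shape to have exactly as many elements as the tensor. A quick check in plain Python (the per-channel reading below assumes 128×128 latents for a 1024×1024 image, which matches the 64·2 factors in the view; that interpretation is my guess, not something from the Flex2 code):

```python
# Shape the pipeline requests: [1, 33, 64, 2, 64, 2]
expected = 1 * 33 * 64 * 2 * 64 * 2
print(expected)  # 540672 elements expected

# Control run: tensor actually has 327680 elements.
# At 128x128 spatial (= 64*2 x 64*2), that is only 20 channels, not 33.
print(327680 // (128 * 128))  # 20

# Text-to-image run: 114688 elements, i.e. only 7 channels at 128x128.
print(114688 // (128 * 128))  # 7
```

So in both runs the tensor handed to `_pack_latents` has fewer channels than the 33 the pipeline's view expects, which suggests the latent/control concatenation upstream, not the image resolution, is what's inconsistent.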