TextEncodeQwenImageEditPlus node is missing target_size now
@Phr00t, not sure what happened, but that node previously had a target_size input below it and now it's gone. I updated nodes_qwen.py in the comfy_extras folder, but I'm still not seeing it. That input helped immensely with zoom issues; can you let me know how to get it back?
Did you update ComfyUI? The update process should stash and reapply local changes, but I've still had to redo the node edits manually after some updates. I should probably bite the bullet and make it an actual separate node, but ugh. Compare your comfy_extras/nodes_qwen.py with mine and make sure they match, then restart ComfyUI and refresh the page...
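If you want a quick sanity check that the two copies really match, a minimal sketch along these lines (the paths are illustrative; adjust them to your install) will print any lines that differ:

import difflib

# Illustrative paths: your live ComfyUI copy vs. a reference copy to compare against.
local_path = "ComfyUI/comfy_extras/nodes_qwen.py"
reference_path = "nodes_qwen_reference.py"

with open(local_path) as f_local, open(reference_path) as f_ref:
    diff = list(difflib.unified_diff(
        f_local.readlines(), f_ref.readlines(),
        fromfile=local_path, tofile=reference_path,
    ))

# Empty diff means the files match.
print("".join(diff) if diff else "Files match.")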
Can I suggest the following patch to your node?
It adds an empty latent output to connect to KSampler, so the computed dimensions (with the correct aspect ratio) can be used directly.
*************** class TextEncodeQwenImageEditPlus(io.ComfyNode):
*** 65,70 ****
--- 65,71 ----
              ],
              outputs=[
                  io.Conditioning.Output(),
+                 io.Latent.Output(),
              ],
          )

*************** class TextEncodeQwenImageEditPlus(io.ComfyNode):
*** 76,81 ****
--- 77,86 ----
          llama_template = "<|im_start|>system\nDescribe key details of the input image (including any objects, characters, poses, facial features, clothing, setting, textures and style), then explain how the user's text instruction should alter, modify or recreate the image. Generate a new image that meets the user's requirements, which can vary from a small change to a completely new image using inputs as a guide.<|im_end|>\n<|im_start|>user\n{}<|im_end|>\n<|im_start|>assistant\n"
          image_prompt = ""

+         # Track computed dimensions for latent output
+         computed_width = None
+         computed_height = None
+
          for i, image in enumerate(images):
              if image is not None:
                  samples = image.movedim(-1, 1)
*************** class TextEncodeQwenImageEditPlus(io.ComfyNode):
*** 94,99 ****
--- 99,109 ----
                      height = int(samples.shape[2] * scale_by / 32) * 32
                      width = int(samples.shape[3] * scale_by / 32) * 32

+                     # Store the computed dimensions from the first image
+                     if computed_width is None:
+                         computed_width = width
+                         computed_height = height
+
                      s = comfy.utils.common_upscale(samples, width, height, "lanczos", "center")
                      ref_latents.append(vae.encode(s.movedim(1, -1)[:, :, :, :3]))

*************** class TextEncodeQwenImageEditPlus(io.ComfyNode):
*** 103,109 ****
          conditioning = clip.encode_from_tokens_scheduled(tokens)
          if len(ref_latents) > 0:
              conditioning = node_helpers.conditioning_set_values(conditioning, {"reference_latents": ref_latents}, append=True)
!         return io.NodeOutput(conditioning)

  class QwenExtension(ComfyExtension):
      @override
--- 113,128 ----
          conditioning = clip.encode_from_tokens_scheduled(tokens)
          if len(ref_latents) > 0:
              conditioning = node_helpers.conditioning_set_values(conditioning, {"reference_latents": ref_latents}, append=True)
!
!         # Create an empty latent with the computed dimensions
!         import torch
!         if computed_width is not None and computed_height is not None:
!             latent = {"samples": torch.zeros((1, 4, computed_height // 8, computed_width // 8))}
!         else:
!             # Default empty latent if no images were provided
!             latent = {"samples": torch.zeros((1, 4, target_size // 8, target_size // 8))}
!
!         return io.NodeOutput(conditioning, latent)

  class QwenExtension(ComfyExtension):
      @override
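One caveat on the patch above: the empty latent is zero-filled with 4 channels, the layout ComfyUI's stock EmptyLatentImage node produces, whereas Qwen-Image's VAE works in a 16-channel latent space. My understanding is that ComfyUI's samplers fix up the channel count of empty latents before sampling, so the 4-channel version should still work; if it doesn't on your build, a hypothetical substitute for the two torch.zeros lines would be the 16-channel layout that EmptySD3LatentImage uses:

import torch

# Hypothetical substitute for the zero-latent lines in the patch above:
# 16 channels matches EmptySD3LatentImage's output, and the spatial // 8
# downscale is unchanged. computed_width / computed_height come from the patch.
latent = {"samples": torch.zeros((1, 16, computed_height // 8, computed_width // 8))}

Either way, the node's new Latent output can go straight into KSampler's latent_image input, replacing a separate empty-latent node.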