What I Learned Upscaling a Long-distance Midjourney Photo w/ Stable Diffusion PLUS unboxing Qwen Image & Wan 2.2

Community Article · Published August 8, 2025

TODAY

tested Wan 2.2, Qwen Image, and the Finegrain image upscaler

see the comments section

Launch Post (story format)

Greenhouse-bay hush. Leaves tilt toward a fake sun; old machines nap in the corners. Two orange suits step in, sleeve patches—tiny Konnektron and Objas—winking like inside jokes.

HOPE: “Hi there! Welcome to our new machine learning blog.”

JUNIPER: “Just pulling all this synthetic data. Here we go!”

Juniper lifts a transparent tablet. The glass blooms: a blue lattice knitting itself into sense— noisy → clean (Text Diffusion), clean → structured (schema), structured → clean (regenerate). Side rail ticks: GNN/GAT for links, LLM for ops.

JUNIPER: “Send your intent in a few words. The attention graph and text diffuser will do the rest.”

HOPE (to agent): “inspect valve room, reduce downtime.”

The graph inhales. Panels slide in like drawers in a tidy lab: ingest → embeddings → workflows → insights → emissions. Badges flicker: Postgres, orchestration, agents online. A small Konnektron icon spins—and the run begins to purr.

Light through the canopy; a breeze stirs the plants. Not flashy—confident. Like a good engine that knows its work.

JUNIPER: “New blog will be fun.”

HOPE: “We’ll post wins and flops here as we experiment with our daily H200 GPU allocation.”

JUNIPER: “See you next time!”

From reindustrialized floors to green bays, the promise holds: clearer context, safer ops, faster delivery. The tablet dims to a calm heartbeat of light.

Long-distance Lower-detail Midjourney


See the comments section for the higher-resolution upscale solution using Stable Diffusion.

Launch Post (log format)

data scientist's log — blog launch

  • published the first entry introducing our context-to-pipeline system

  • core stack includes text diffusion for noisy→clean prompt mapping, gnn/gat for linking, and llm for execution

  • demo shows a short instruction expanded into a complete workflow: ingest → embeddings → workflows → insights → emissions

  • runs on konnektron hardware with postgres, orchestration, and agents online

  • spring focus: command layer buildout (automation, triage, memory)

  • summer focus: prediction/generative layers and full data factory buildout

  • large-batch runs executed on hugging face pro with daily h200 allocation

  • results, benchmarks, and iteration notes will be posted here

Graph networks have Hope's full attention


Community

First day experimenting with Hugging Face Spaces

HOW I SPENT MY 25-MIN H200 ALLOCATION

Tested: Wan 2.2, Qwen Image, Finegrain

Results

  • Wan 2.2
    • impressive video generation; tested a couple of Midjourney images to compare the generated video against that service
    • very comparable! it does not seem to have as much support for diversity, and a couple of characters had strange artifacts
    • otherwise a very cool first result; will test more thoroughly locally in the coming weeks
  • Qwen Image
    • Very strong results, producing fashion-runway images of people holding signs with my project names
    • Photorealistic every time. Decent fashion. Very strong adherence to the prompt. Text was flawless 3 out of 4 times; it struggled with "build w/ company name" but was perfect with "build with company name"
    • Images are unstyled, as if straight from the camera body; folks with post-production skills may prefer this to opinionated results such as Midjourney's (or not, if you are perfectly aligned with Midjourney's styles, which do suit me)
  • Finegrain
    • a single inference requested about 60 s of H200 GPU time
    • does a good job overall!
    • a local M1 Max run (MPS, fp16) at full resolution took ~117 s under a normal workload, closely matching HF quality after removing the 768 px cap (see the sketch after this list)
      • swapping to fp32 plus a higher ControlNet scale improved structure fidelity
      • Note: the default HF app.py downsized inputs to 768 px on the short side; removing that behavior was key to preserving detail and matching the Space's visual fidelity.
    • it does not adhere to characters and will completely change a face; clothing works well and backgrounds are perfect
      • with this, I can photoshop the original face into the new upscaled image if I want to keep the added detail
      • lowering denoise_strength (0.2–0.25) and raising controlnet_scale (0.65–0.7) reduced unwanted changes
    • used default settings initially, then explored the "upscale" vs "detail boost" presets for better control
    • like Wan 2.2, it struggles with skin diversity, so a couple of characters changed race or had odd fabric artifacts on their faces
      • this bias was present in both HF and local runs, traceable to Stable Diffusion 1.5 / LAION dataset limitations
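
For anyone repeating the local run, here is a minimal sketch of the device/precision choices and the 768 px short-side cap I removed. The helper name `cap_short_side` and the file name are my own illustration, not the Finegrain Space's actual app.py code; it just shows what the cap does and where the fp16 vs fp32 choice comes in.

```python
import torch
from PIL import Image

# Local run setup (sketch): MPS on the M1 Max, fp16 by default.
# Swapping dtype to torch.float32 improved structure fidelity at the cost of speed.
device = "mps" if torch.backends.mps.is_available() else "cpu"
dtype = torch.float16  # or torch.float32 for better structure fidelity

def cap_short_side(img: Image.Image, max_short_side: int = 768) -> Image.Image:
    """Downsize so the shorter side is at most `max_short_side` px.

    Hypothetical helper illustrating the kind of resize the default HF app
    applied to inputs; skipping it locally preserves full-resolution detail.
    """
    short = min(img.size)
    if short <= max_short_side:
        return img
    scale = max_short_side / short
    return img.resize((round(img.width * scale), round(img.height * scale)), Image.LANCZOS)

src = Image.open("midjourney_original.png").convert("RGB")
# For the local run, pass `src` straight to the upscaler instead of cap_short_side(src).
```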


Article author


Close up

Midjourney

Original close-up portrait using the Omni character builder and style references from previous iterations.


Long Distance Upscaler

The original was produced in Midjourney, with the faces quite distorted.

I used Finegrain Upscale (Stable Diffusion) locally on a 3080 and managed to get something closer to what I wanted.

See the additional notes earlier in this thread for the first iterations.

Upscaled Locally


Midjourney Original


———————

## Finegrain Image Enhancer – Bias-Resistant Preset

**Prompt**  

4k photo of two women standing at the entrance to an indoor farming manufacturing facility, woman on the left is african american, woman on the right is caucasian


**Negative Prompt**  

worst quality, low quality, blurry

**Seed**  

8734

**Settings**  
- **Upscale Factor**: `2`  
- **ControlNet Scale**: `0.7`  
- **ControlNet Scale Decay**: `0.5`  
- **Condition Scale**: `2`  
- **Latent Tile Width**: `112`  
- **Latent Tile Height**: `144`  
- **Denoise Strength**: `0.2`  
- **Number of Inference Steps**: `21`  
- **Solver**: `DDIM`  
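
The Finegrain enhancer runs its own pipeline, and a few of these knobs (ControlNet Scale Decay, Condition Scale, the latent tile sizes) have no direct equivalent in plain diffusers. Still, if you want a rough local approximation of this preset, an SD 1.5 img2img pass conditioned on the tile ControlNet gets most of the way there. This is a sketch under assumptions: the model IDs and file names are placeholders, and results will not match the Space exactly.

```python
import torch
from PIL import Image
from diffusers import (
    ControlNetModel,
    DDIMScheduler,
    StableDiffusionControlNetImg2ImgPipeline,
)

device = "cuda" if torch.cuda.is_available() else "cpu"  # "mps" on Apple Silicon
dtype = torch.float16 if device == "cuda" else torch.float32

# Tile ControlNet + an SD 1.5 checkpoint approximates the upscaler's backbone.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1e_sd15_tile", torch_dtype=dtype
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=dtype
).to(device)
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)  # Solver: DDIM
pipe.enable_attention_slicing()  # helps fit large images on a 10 GB 3080

src = Image.open("midjourney_original.png").convert("RGB")
# Upscale Factor 2: pre-resize, then let the low-strength img2img pass re-add detail.
up = src.resize((src.width * 2, src.height * 2), Image.LANCZOS)

result = pipe(
    prompt=(
        "4k photo of two women standing at the entrance to an indoor farming "
        "manufacturing facility, woman on the left is african american, "
        "woman on the right is caucasian"
    ),
    negative_prompt="worst quality, low quality, blurry",
    image=up,
    control_image=up,
    strength=0.2,                        # Denoise Strength
    controlnet_conditioning_scale=0.7,   # ControlNet Scale (decay not exposed here)
    num_inference_steps=21,              # Number of Inference Steps
    generator=torch.Generator("cpu").manual_seed(8734),  # Seed
).images[0]
result.save("upscaled_2x.png")
```

Keeping strength low and the ControlNet scale relatively high is what preserves the original composition while still sharpening detail, which mirrors the bias-resistant intent of the preset above.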

Nicely Upscaled Long Distance Midjourney

