# Shenhe LoRA Usage

## Model Source

The LoRA model used in this project is sourced from:

[TJ Flux Shenhe on CivitAI](https://civitai.com/models/866465/tj-flux-shenhe?modelVersionId=969578)

## Regional Flux Pipeline

The Regional Flux Pipeline utilized in this project is available at:

[Regional Prompting FLUX on GitHub](https://github.com/instantX-research/Regional-Prompting-FLUX)

## Acknowledgments

We would like to express our sincere gratitude to the creators and contributors of the LoRA model and the Regional Flux Pipeline for their valuable work and resources.

## Installation
```bash
pip install -U diffusers transformers torch sentencepiece peft controlnet-aux moviepy protobuf
```
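
After installing, a quick import-and-version check can confirm the environment is usable (a minimal sketch; the exact versions reported will depend on when you install):

```python
# Sanity check: the core packages import and CUDA is visible to torch.
import torch
import diffusers
import transformers
import peft

print("torch:", torch.__version__, "| CUDA available:", torch.cuda.is_available())
print("diffusers:", diffusers.__version__)
print("transformers:", transformers.__version__)
print("peft:", peft.__version__)
```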

## Demo 
```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipe.load_lora_weights("svjack/FLUX_Shenhe_Lora")
pipe.enable_sequential_cpu_offload()

prompt = "tj_sthenhe, hair ornament,sliver hair,long hair,braid,"

image = pipe(prompt,
             num_inference_steps=24,
             guidance_scale=3.5,
            ).images[0]
image.save("shenhe.png")

from IPython import display
display.Image("shenhe.png", width=512, height=512)
```
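
`enable_sequential_cpu_offload()` keeps VRAM usage low by streaming weights to the GPU layer by layer, at the cost of speed. If your GPU has enough memory to hold the full bfloat16 pipeline, you can skip offloading (a sketch of the alternatives; pick one):

```python
# Option A: keep the whole pipeline on the GPU for the fastest generation.
pipe.to("cuda")

# Option B: a middle ground that offloads whole sub-models rather than individual layers.
# pipe.enable_model_cpu_offload()
```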


![image/png](https://cdn-uploads.huggingface.co/production/uploads/634dffc49b777beec3bc6448/ET5fs2i9VoGW_Png5tE1T.png)

![shenhe](https://github.com/user-attachments/assets/34159126-3058-4078-a101-fdb22839d1f0)


# Using Shenhe with the Regional Flux Pipeline (Draw Shenhe in a Custom Rectangular Region)

This README provides a guide to the Regional Flux Pipeline, a PyTorch-based pipeline for generating images with regional control. It lets you assign different prompts to different regions of the image, enabling fine-grained control over the generated content.

## Table of Contents

- [Installation](#installation)
- [Usage](#usage)
  - [Step 1: Load the Pipeline](#step-1-load-the-pipeline)
  - [Step 2: Configure Attention Processors](#step-2-configure-attention-processors)
  - [Step 3: Set General Settings](#step-3-set-general-settings)
  - [Step 4: Define Regional Prompts and Masks](#step-4-define-regional-prompts-and-masks)
  - [Step 5: Configure Region Control Factors](#step-5-configure-region-control-factors)
  - [Step 6: Generate the Image](#step-6-generate-the-image)
  - [Step 7: Display the Image](#step-7-display-the-image)
  - [Step 8: Draw a Transparent Rectangle](#step-8-draw-a-transparent-rectangle)
- [Chinese Translations](#chinese-translations)

## Installation

### Create a New Conda Environment

```bash
conda create --name py310 python=3.10 && conda activate py310 && pip install ipykernel && python -m ipykernel install --user --name py310 --display-name "py310"
```

### Install Dependencies

We use a specific commit from the `diffusers` repository to ensure reproducibility, as newer versions may produce different results.

```bash
sudo apt-get update && sudo apt-get install git-lfs ffmpeg cbm
```

```bash
# Install diffusers locally
git clone https://github.com/huggingface/diffusers.git
cd diffusers

# Reset diffusers version to 0.31.dev
git reset --hard d13b0d63c0208f2c4c078c4261caf8bf587beb3b
pip install -e ".[torch]"
cd ..

# Install other dependencies
pip install -U transformers sentencepiece protobuf peft

# Clone this repo
git clone https://github.com/svjack/Regional-Prompting-FLUX

# Replace file in diffusers
cd Regional-Prompting-FLUX
cp transformer_flux.py ../diffusers/src/diffusers/models/transformers/transformer_flux.py
huggingface-cli login
```
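
To confirm the patched `diffusers` build and the regional pipeline import cleanly, run a quick check from inside the `Regional-Prompting-FLUX` directory (where `pipeline_flux_regional.py` lives):

```bash
python -c "from pipeline_flux_regional import RegionalFluxPipeline, RegionalFluxAttnProcessor2_0; import diffusers; print('diffusers', diffusers.__version__)"
```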

## Usage

### Step 1: Load the Pipeline

First, load the Regional Flux Pipeline from a pretrained model and set the desired data type:

```python
import torch
from pipeline_flux_regional import RegionalFluxPipeline, RegionalFluxAttnProcessor2_0

pipeline = RegionalFluxPipeline.from_pretrained("black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16)
pipeline.load_lora_weights("svjack/FLUX_Shenhe_Lora")
pipeline.to("cuda")
```

### Step 2: Configure Attention Processors

Next, configure the attention processors to use the `RegionalFluxAttnProcessor2_0` for specific attention layers:

```python
attn_procs = {}
for name in pipeline.transformer.attn_processors.keys():
    if 'transformer_blocks' in name and name.endswith("attn.processor"):
        attn_procs[name] = RegionalFluxAttnProcessor2_0()
    else:
        attn_procs[name] = pipeline.transformer.attn_processors[name]
pipeline.transformer.set_attn_processor(attn_procs)
```
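
Optionally, verify how many attention layers were switched to the regional processor (a small sketch reusing the objects defined above):

```python
# Count the attention layers that now use the regional processor.
num_regional = sum(
    isinstance(proc, RegionalFluxAttnProcessor2_0)
    for proc in pipeline.transformer.attn_processors.values()
)
print(f"{num_regional} of {len(pipeline.transformer.attn_processors)} attention layers are regional")
```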

### Step 3: Set General Settings

Define the general settings for the image generation:

```python
image_width = 1280
image_height = 768
num_inference_steps = 24
seed = 124

base_prompt = "A snowy chinese hill in the background, A big sun rises."
background_prompt = "a photo of a snowy chinese hill"
```
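
FLUX patchifies the latents, so width and height should be multiples of 16 (an assumption based on the VAE downscale factor of 8 and 2x2 patching). The values above already satisfy this, but a guard like the following (a minimal sketch) catches mistakes when you change them:

```python
# Guard against dimensions the pipeline cannot patchify cleanly.
assert image_width % 16 == 0 and image_height % 16 == 0, "width/height should be multiples of 16"
```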

### Step 4: Define Regional Prompts and Masks

Specify the regional prompts and corresponding masks for different parts of the image:

```python
regional_prompt_mask_pairs = {
    "0": {
        "description": "A dignified woman stands in the foreground, her sliver hair and long braid adorned with a hair ornament, her face illuminated by the cold light of the snow. Her expression is one of determination and sorrow, her clothing and appearance reflecting the historical period. The snow casts a serene yet dramatic light across her features, its cold embrace enveloping her in a world of ice and frost. tj_sthenhe, hair ornament, sliver hair, long hair, braid.",
        "mask": [128, 128, 640, 768]
    }
}
```
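
Each mask is a pixel-space bounding box `[x1, y1, x2, y2]` that must lie inside the canvas defined in Step 3. A quick check (a minimal sketch) can catch out-of-bounds boxes before a long generation run:

```python
# Verify every region box fits inside the image and has positive area.
for region_idx, region in regional_prompt_mask_pairs.items():
    x1, y1, x2, y2 = region["mask"]
    assert 0 <= x1 < x2 <= image_width, f"region {region_idx}: x range out of bounds"
    assert 0 <= y1 < y2 <= image_height, f"region {region_idx}: y range out of bounds"
```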

### Step 5: Configure Region Control Factors

Set the control factors for region-specific attention injection:

```python
mask_inject_steps = 10
double_inject_blocks_interval = 1
single_inject_blocks_interval = 1
base_ratio = 0.2
```

### Step 6: Generate the Image

Generate the image using the specified prompts and masks:

```python
regional_prompts = []
regional_masks = []
background_mask = torch.ones((image_height, image_width))

for region_idx, region in regional_prompt_mask_pairs.items():
    description = region['description']
    mask = region['mask']
    x1, y1, x2, y2 = mask
    mask = torch.zeros((image_height, image_width))
    mask[y1:y2, x1:x2] = 1.0
    background_mask -= mask
    regional_prompts.append(description)
    regional_masks.append(mask)

if background_mask.sum() > 0:
    regional_prompts.append(background_prompt)
    regional_masks.append(background_mask)

image = pipeline(
    prompt=base_prompt,
    width=image_width, height=image_height,
    mask_inject_steps=mask_inject_steps,
    num_inference_steps=num_inference_steps,
    generator=torch.Generator("cuda").manual_seed(seed),
    joint_attention_kwargs={
        "regional_prompts": regional_prompts,
        "regional_masks": regional_masks,
        "double_inject_blocks_interval": double_inject_blocks_interval,
        "single_inject_blocks_interval": single_inject_blocks_interval,
        "base_ratio": base_ratio
    },
).images[0]

image.save(f"shenhe_in_snow_hill.jpg")
```
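
Since the masks drive the regional attention injection, it can be worth previewing them before generating (a sketch; assumes `matplotlib` is installed, which is not in the dependency list above):

```python
# Optional: save a side-by-side preview of the regional masks (including the background mask).
import matplotlib.pyplot as plt

fig, axes = plt.subplots(1, len(regional_masks), figsize=(4 * len(regional_masks), 4))
if len(regional_masks) == 1:
    axes = [axes]
for ax, mask in zip(axes, regional_masks):
    ax.imshow(mask.numpy(), cmap="gray", vmin=0, vmax=1)
    ax.axis("off")
plt.savefig("regional_masks_preview.png")
```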

### Step 7: Display the Image

Display the generated image:

```python
from IPython import display
display.Image("shenhe_in_snow_hill.jpg", width=512, height=512)
```
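
Outside a notebook, the saved file can be opened directly with PIL instead:

```python
# Alternative for plain Python scripts (no IPython display available).
from PIL import Image
Image.open("shenhe_in_snow_hill.jpg").show()
```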

![shenhe_in_snow_hill](https://github.com/user-attachments/assets/8edfb639-a624-4218-845d-b8579b41c62a)

### Step 8: Draw a Transparent Rectangle

Optionally, draw a transparent rectangle on the generated image to highlight a specific region:

```python
from PIL import Image, ImageDraw

def draw_transparent_rectangle(image_path, bbox, color, alpha=128, output_path=None):
    """
    在指定区域绘制一个半透明的矩形,并将修改后的图片保存到本地新路径。

    :param image_path: 图片路径
    :param bbox: 长度为4的列表,表示矩形的边界框 [x1, y1, x2, y2]
    :param color: 颜色,格式为 (R, G, B)
    :param alpha: 透明度,范围为 0(完全透明)到 255(完全不透明),默认值为 128
    :param output_path: 保存修改后图片的路径,如果为 None,则覆盖原图
    :return: 修改后的图片对象
    """
    image = Image.open(image_path).convert("RGBA")
    overlay = Image.new('RGBA', image.size, (0, 0, 0, 0))
    draw = ImageDraw.Draw(overlay)

    x1, y1, x2, y2 = bbox
    draw.rectangle([x1, y1, x2, y2], fill=(*color, alpha))

    image = Image.alpha_composite(image, overlay)

    if output_path is None:
        output_path = image_path

    # JPEG cannot store an alpha channel, so drop it before saving to .jpg/.jpeg.
    if output_path.lower().endswith((".jpg", ".jpeg")):
        image = image.convert("RGB")

    image.save(output_path)
    return image

draw_transparent_rectangle("shenhe_in_snow_hill.jpg", [128, 128, 640, 768], (255, 0, 0), alpha=128, output_path="shenhe_in_snow_hill_rec.png")
display.Image("shenhe_in_snow_hill_rec.png", width=512, height=512)
```

![shenhe_in_snow_hill_rec](https://github.com/user-attachments/assets/64914e05-6cc5-4905-92e1-8ad112561e28)

## Chinese Translations

- `base_prompt`: "背景是雪中的中国山丘,一轮大太阳正在升起。"
- `background_prompt`: "一张雪中的中国山丘的照片"

The contents of `regional_prompt_mask_pairs` translate as follows:

```json
{
    "0": {
        "description": "一位端庄的女子站在前景中,她的银发和长辫子上装饰着发饰,她的脸被雪的冷光照亮。她的表情既坚定又悲伤,她的服装和外貌反映了历史时期。雪花在她脸上投下宁静而戏剧性的光线,它的寒冷拥抱将她包裹在冰雪世界中。tj_sthenhe,发饰,银发,长发,辫子。",
        "mask": [128, 128, 640, 768]
    }
}
```