kfirgold99 committed
Commit 9105a9f · verified · 1 Parent(s): 4bc3649

Update README.md

Files changed (1):
1. README.md +253 -3

README.md CHANGED
@@ -1,3 +1,253 @@
- ---
- license: apache-2.0
- ---

# Piece it Together: Part-Based Concepting with IP-Priors

> Elad Richardson, Kfir Goldberg, Yuval Alaluf, Daniel Cohen-Or
> Tel Aviv University, Bria AI
>
> Advanced generative models excel at synthesizing images but often rely on text-based conditioning. Visual designers, however, often work beyond language, drawing inspiration directly from existing visual elements. In many cases, these elements represent only fragments of a potential concept, such as a uniquely structured wing or a specific hairstyle, serving as inspiration for the artist to explore how they can come together creatively into a coherent whole. Recognizing this need, we introduce a generative framework that seamlessly integrates a partial set of user-provided visual components into a coherent composition while simultaneously sampling the missing parts needed to generate a plausible and complete concept. Our approach builds on a strong and underexplored representation space, extracted from IP-Adapter+, on which we train IP-Prior, a lightweight flow-matching model that synthesizes coherent compositions based on domain-specific priors, enabling diverse and context-aware generations. Additionally, we present a LoRA-based fine-tuning strategy that significantly improves prompt adherence in IP-Adapter+ for a given task, addressing its common trade-off between reconstruction quality and prompt adherence.

<a href="https://arxiv.org/abs/2503.10365"><img src="https://img.shields.io/badge/arXiv-2503.10365-b31b1b.svg" height=20.5></a>
<a href="https://eladrich.github.io/PiT/"><img src="https://img.shields.io/static/v1?label=Project&message=Website&color=red" height=20.5></a>

<p align="center">
<img src="https://eladrich.github.io/PiT/static/figures/teaser.jpg" width="800px"/>
<br>
Using a dedicated prior for the target domain, our method, Piece it Together (PiT), effectively completes missing information by seamlessly integrating the given elements into a coherent composition while adding the missing pieces needed for the complete concept to reside in the prior domain.
</p>

## Description :scroll:

Official implementation of the paper "Piece it Together: Part-Based Concepting with IP-Priors".

## Table of contents
- [Piece it Together: Part-Based Concepting with IP-Priors](#piece-it-together-part-based-concepting-with-ip-priors)
  - [Description :scroll:](#description-scroll)
  - [Table of contents](#table-of-contents)
  - [Getting started with PiT :rocket:](#getting-started-with-pit-rocket)
    - [Setup your environment](#setup-your-environment)
  - [Inference with PiT](#inference-with-pit)
  - [Training PiT](#training-pit)
  - [Inference with IP-LoRA](#inference-with-ip-lora)
  - [Training IP-LoRA](#training-ip-lora)
    - [Preparing your data](#preparing-your-data)
    - [Running the training script](#running-the-training-script)
  - [Exploring the IP+ space](#exploring-the-ip-space)
    - [Finding new directions](#finding-new-directions)
    - [Editing images with found directions](#editing-images-with-found-directions)
  - [Acknowledgments](#acknowledgments)
  - [Citation](#citation)

## Getting started with PiT :rocket:

### Setup your environment

1. Clone the repo:

```bash
git clone https://github.com/eladrich/PiT
cd PiT
```

2. Install `uv`:

Installation instructions are available [here](https://docs.astral.sh/uv/getting-started/installation/). On Linux this should be:

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
source $HOME/.local/bin/env
```

3. Install the dependencies:

```bash
uv sync
```

4. Activate your `.venv` and set the Python path (an optional sanity check is sketched after these steps):

```bash
source .venv/bin/activate
export PYTHONPATH=${PYTHONPATH}:${PWD}
```

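As an optional sanity check (this assumes you are on a GPU machine and that PyTorch is among the synced dependencies):

```bash
# Verify the environment resolves and CUDA is visible to PyTorch.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
```
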
## Inference with PiT

| Domain | Examples | Link |
|--------|----------|------|
| Characters | <img src="https://eladrich.github.io/PiT/static/figures/model_results/results_creatures.png" width="400px"/> | [Here](https://huggingface.co/kfirgold99/Piece-it-Together/tree/main/models/characters_ckpt) |
| Products | <img src="https://eladrich.github.io/PiT/static/figures/model_results/results_products.png" width="400px"/> | [Here](https://huggingface.co/kfirgold99/Piece-it-Together/tree/main/models/products_ckpt) |
| Toys | <img src="https://eladrich.github.io/PiT/static/figures/model_results/results_toys.png" width="400px"/> | [Here](https://huggingface.co/kfirgold99/Piece-it-Together/tree/main/models/plush_ckpt) |

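The checkpoints are hosted in the [Piece-it-Together](https://huggingface.co/kfirgold99/Piece-it-Together) Hugging Face repo. As a sketch, one way to fetch them is with `huggingface-cli`; the local `weights/` directory below is an arbitrary choice, not a path the code requires:

```bash
# Sketch: download the Characters checkpoint; swap the pattern for products_ckpt or plush_ckpt.
huggingface-cli download kfirgold99/Piece-it-Together \
  --include "models/characters_ckpt/*" \
  --local-dir weights/
```
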
## Training PiT

### Data Generation

PiT assumes that the target images and part images live in the same directory, with the naming convention `image_name.jpg` for the base image and `image_name_i.jpg` for its parts.

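As an illustration (these file names are purely hypothetical), a characters data directory could look like:

```
characters_data/
  dragon.jpg       # base (target) image
  dragon_0.jpg     # part 0
  dragon_1.jpg     # part 1
  griffin.jpg      # another base image
  griffin_0.jpg
  griffin_1.jpg
```
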
To generate training data, see the sample scripts:

```bash
python -m scripts.generate_characters
```

```bash
python -m scripts.generate_products
```

### Training

For training, see `training/coach.py` and the example below:

```bash
python -m scripts.train --config_path=configs/train/train_characters.yaml
```

## PiT Inference

For inference, see `scripts/infer.py` with the corresponding configs under `configs/infer`:

```bash
python -m scripts.infer --config_path=configs/infer/infer_characters.yaml
```

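To run every bundled inference config in one go, a small shell loop works; this assumes the other configs under `configs/infer` also use the `.yaml` extension:

```bash
# Sketch: run inference for each config shipped under configs/infer.
for cfg in configs/infer/*.yaml; do
  python -m scripts.infer --config_path="$cfg"
done
```
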
## Inference with IP-LoRA

1. Download the IP-Adapter checkpoint and the LoRAs:

```bash
ip_lora_inference/download_ip_adapter.sh
ip_lora_inference/download_loras.sh
```

2. Run inference with your preferred model. An example for the styled-generation (character sheet) LoRA is shown below, followed by a sketched variant for the text-adherence LoRA:

```bash
python ip_lora_inference/inference_ip_lora.py \
  --lora_type "character_sheet" \
  --lora_path "weights/character_sheet/pytorch_lora_weights.safetensors" \
  --prompt "a character sheet displaying a creature, from several angles with 1 large front view in the middle, clean white background. In the background we can see half-completed, partially colored, sketches of different parts of the object" \
  --output_dir "ip_lora_inference/character_sheet/" \
  --ref_images_paths "assets/character_sheet_default_ref.jpg" \
  --ip_adapter_path "weights/ip_adapter/sdxl_models/ip-adapter-plus_sdxl_vit-h.bin"
```

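A variant for the text-adherence LoRA might look like the sketch below. The `creature_in_scene` value for `--lora_type`, the LoRA weights path, and the prompt are assumptions (inferred from the training section's `--prompt_mode`), not documented options:

```bash
# Sketch (unverified): text-adherence LoRA inference; the --lora_type value and paths are assumptions.
python ip_lora_inference/inference_ip_lora.py \
  --lora_type "creature_in_scene" \
  --lora_path "weights/creature_in_scene/pytorch_lora_weights.safetensors" \
  --prompt "a photo of a creature exploring a snowy mountain ridge" \
  --output_dir "ip_lora_inference/creature_in_scene/" \
  --ref_images_paths "assets/character_sheet_default_ref.jpg" \
  --ip_adapter_path "weights/ip_adapter/sdxl_models/ip-adapter-plus_sdxl_vit-h.bin"
```
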
## Training IP-LoRA

### Preparing your data

The expected data format for the training script is as follows:

```
--base_dir/
----targets/
------img1.jpg
------img1.txt
------img2.jpg
------img2.txt
------img3.jpg
------img3.txt
.
.
.
----refs/
------img1_ref.jpg
------img2_ref.jpg
------img3_ref.jpg
.
.
.
```

Where `imgX.jpg` is the target image for the input reference image `imgX_ref.jpg`, with its prompt in `imgX.txt`.

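Before launching training, it can help to verify that every target has a matching prompt file and reference image. A minimal sketch, assuming the dataset root is `base_dir/` as in the layout above:

```bash
# Sanity check: report targets missing a prompt (.txt) or a reference image.
for img in base_dir/targets/*.jpg; do
  name=$(basename "$img" .jpg)
  [ -f "base_dir/targets/${name}.txt" ] || echo "missing prompt: ${name}.txt"
  [ -f "base_dir/refs/${name}_ref.jpg" ] || echo "missing ref: ${name}_ref.jpg"
done
```
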
### Running the training script

To train a character-sheet styled-generation LoRA, run the following command:

```bash
# Effective batch size = train_batch_size (2) x gradient_accumulation_steps (8) = 16
python ./ip_lora_train/train_ip_lora.py \
  --rank 64 \
  --resolution 1024 \
  --validation_epochs 1 \
  --num_train_epochs 100 \
  --checkpointing_steps 50 \
  --train_batch_size 2 \
  --learning_rate 1e-4 \
  --dataloader_num_workers 1 \
  --gradient_accumulation_steps 8 \
  --dataset_base_dir <base_dir> \
  --prompt_mode character_sheet \
  --output_dir ./output/train_ip_lora/character_sheet
```

For the text-adherence LoRA, run the following command:

```bash
python ./ip_lora_train/train_ip_lora.py \
  --rank 64 \
  --resolution 1024 \
  --validation_epochs 1 \
  --num_train_epochs 100 \
  --checkpointing_steps 50 \
  --train_batch_size 2 \
  --learning_rate 1e-4 \
  --dataloader_num_workers 1 \
  --gradient_accumulation_steps 8 \
  --dataset_base_dir <base_dir> \
  --prompt_mode creature_in_scene \
  --output_dir ./output/train_ip_lora/creature_in_scene
```

## Exploring the IP+ space

Start by downloading the needed IP+ checkpoint and the directions presented in the paper:

```bash
ip_plus_space_exploration/download_directions.sh
ip_plus_space_exploration/download_ip_adapter.sh
```

### Finding new directions

To find a direction in the IP+ space from "class1" (e.g. "scrawny") to "class2" (e.g. "muscular"):

1. Create `class1_dir` and `class2_dir` containing images of the source and target classes, respectively.

2. Run the `find_direction` script:

```bash
python ip_plus_space_exploration/find_direction.py --class1_dir <path_to_source_class> --class2_dir <path_to_target_class> --output_dir ./ip_directions --ip_model_type "plus"
```

### Editing images with found directions

Use a direction found in the previous stage, or one downloaded from [HuggingFace](https://huggingface.co/kfirgold99/Piece-it-Together):

```bash
python ip_plus_space_exploration/edit_by_direction.py --ip_model_type "plus" --image_path <source_image> --direction_path <path_to_chosen_direction> --direction_type "ip" --output_dir "./edit_by_direction/"
```

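To sweep a whole folder of source images with the same direction, a small shell loop can help. This is a sketch: the `my_images/` folder and the direction file name are hypothetical, and only flags shown above are used:

```bash
# Sketch: apply one chosen direction to every image in a folder (hypothetical paths).
for img in my_images/*.jpg; do
  python ip_plus_space_exploration/edit_by_direction.py \
    --ip_model_type "plus" \
    --image_path "$img" \
    --direction_path ip_directions/scrawny_to_muscular.pt \
    --direction_type "ip" \
    --output_dir "./edit_by_direction/"
done
```
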
## Acknowledgments

Code is based on:
- https://github.com/pOpsPaper/pOps
- https://github.com/cloneofsimo/minRF by the great [@cloneofsimo](https://github.com/cloneofsimo)

## Citation

If you use this code for your research, please cite the following paper:

```
@misc{richardson2025piece,
      title={Piece it Together: Part-Based Concepting with IP-Priors},
      author={Richardson, Elad and Goldberg, Kfir and Alaluf, Yuval and Cohen-Or, Daniel},
      year={2025},
      eprint={2503.10365},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2503.10365},
}
```