Update README.md
README.md CHANGED
@@ -9,3 +9,92 @@ app_file: app.py
pinned: false
license: apache-2.0
---
## PuLID for FLUX: Portrait-Guided Image Generation

This code implements **PuLID (Pure and Lightning ID customization)** for FLUX.1-dev. Given an ID (identity) image as guidance, it generates personalized images, combining the FLUX diffusion model with identity preservation.

### Key Features

**1. Identity-Guided Generation**
- Upload an ID image (portrait photo) to guide the generation process
- Control identity strength with an adjustable ID weight (0.0-3.0)
- Preserve facial features while applying various artistic styles
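
One way to picture the ID weight is as a scale factor on an identity-derived correction that is added to the model's intermediate features. The sketch below assumes that residual form. `inject_identity`, `hidden`, and `id_correction` are hypothetical names for illustration and are not taken from `app.py`.

```python
def inject_identity(hidden, id_correction, id_weight=1.0):
    """Add an identity correction to hidden features, scaled by id_weight.

    Assumption: identity guidance acts as a weighted residual. The 0.0-3.0
    range mirrors the ID weight range listed above.
    """
    if not 0.0 <= id_weight <= 3.0:
        raise ValueError("id_weight must be between 0.0 and 3.0")
    if len(hidden) != len(id_correction):
        raise ValueError("hidden and id_correction must have the same length")
    # id_weight = 0.0 leaves the features untouched (no identity guidance).
    return [h + id_weight * c for h, c in zip(hidden, id_correction)]
```

Under this assumption, a weight of 0.0 disables identity guidance, and larger values pull the result further toward the reference face.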

**2. Advanced Configuration Options**
- **Resolution Control**: Adjustable width (256-1536px) and height (256-1536px)
- **Generation Steps**: 1-20 steps, trading quality against speed
- **Guidance Scale**: Fine-tune adherence to the prompt (1.0-10.0)
- **Seed Control**: Reproducible results with manual seed input
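
The ranges above can be enforced before a request reaches the model. The following is a minimal sketch, not the app's actual code. `GenerationSettings`, `normalize`, and `resolve_seed` are hypothetical names, the default values are arbitrary picks inside the documented ranges, and treating a negative seed as "pick a random one" is an assumed convention.

```python
import random
from dataclasses import dataclass


@dataclass
class GenerationSettings:
    # Defaults are illustrative assumptions, not values from the source.
    width: int = 1024
    height: int = 1024
    num_steps: int = 20
    guidance: float = 4.0
    seed: int = -1  # assumed convention: negative means "random"


def clamp(value, low, high):
    """Limit value to the closed interval [low, high]."""
    return max(low, min(high, value))


def normalize(settings):
    """Return a copy of settings with each field clamped to its documented range."""
    return GenerationSettings(
        width=int(clamp(settings.width, 256, 1536)),
        height=int(clamp(settings.height, 256, 1536)),
        num_steps=int(clamp(settings.num_steps, 1, 20)),
        guidance=float(clamp(settings.guidance, 1.0, 10.0)),
        seed=settings.seed,
    )


def resolve_seed(seed):
    """Keep a non-negative seed for reproducibility; otherwise draw a random one."""
    return seed if seed >= 0 else random.randrange(2**32)
```

Reusing the same resolved seed with the same settings is what makes a run repeatable.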

**3. True CFG (Classifier-Free Guidance)**
- Fake CFG mode (scale=1): Faster generation with basic guidance
- True CFG mode (scale>1): Higher quality, with negative prompt support
- Configurable timestep at which CFG starts to apply
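
Classifier-free guidance is usually written as `uncond + scale * (cond - uncond)`, where `cond` is the prediction for the prompt and `uncond` is the prediction for the negative (or empty) prompt. The sketch below uses that standard form on plain Python lists. `guided_prediction` and its parameter names are hypothetical, and the assumption that guidance is skipped before `start_step` is one reading of the "configurable timestep" option.

```python
def guided_prediction(cond, uncond, true_cfg_scale, step, start_step=0):
    """Combine conditional and unconditional predictions with CFG.

    At scale <= 1, or before start_step, the conditional prediction is
    returned unchanged, so the unconditional pass is not needed at all.
    """
    if true_cfg_scale <= 1.0 or step < start_step:
        return list(cond)
    # Move away from the unconditional prediction, toward the conditional one.
    return [u + true_cfg_scale * (c - u) for c, u in zip(cond, uncond)]
```

With a scale of 1 the formula reduces to `cond`, which is consistent with the scale=1 mode being faster: the negative-prompt pass can be skipped.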

**4. Technical Architecture**
- Built on the FLUX.1-dev diffusion model
- T5 text encoder for prompt understanding
- CLIP model working alongside T5 for text conditioning
- Autoencoder for latent space operations
- GPU acceleration with CUDA support

### How It Works

1. **Text Prompt Input**: Describe the desired image style (e.g., "portrait, pixar")
2. **ID Image Upload**: Provide a reference portrait for identity guidance
3. **Parameter Tuning**: Adjust generation settings for optimal results
4. **Image Generation**: The model creates an image matching the prompt while preserving the identity
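
The four steps above can be sketched as a single function call. This is a hedged illustration only. `run_generation` and `generate_fn` are hypothetical names, with `generate_fn` standing in for whatever the app actually invokes. Requiring an ID image is an assumption made for this example.

```python
def run_generation(prompt, id_image, settings, generate_fn):
    """Validate the inputs, then hand them to an injected generator callable."""
    if not prompt.strip():
        raise ValueError("a text prompt is required")
    if id_image is None:
        raise ValueError("an ID image is required for identity guidance")
    # settings is a plain dict of tuning parameters (steps, guidance, seed, ...).
    return generate_fn(prompt=prompt, id_image=id_image, **settings)


def fake_generate(prompt, id_image, **settings):
    """Stand-in generator that echoes its inputs instead of producing an image."""
    return {"prompt": prompt, "id_image": id_image, "settings": settings}


result = run_generation("portrait, pixar", "face.png", {"num_steps": 20}, fake_generate)
```

Passing the generator in as an argument keeps the sketch runnable without a GPU or model weights.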

### Example Use Cases
- Transform portraits into different artistic styles (ice sculpture, Pixar animation)
- Create personalized avatars that maintain facial identity
- Generate creative variations of portraits with text prompts
- Produce consistent character designs across different scenarios

The system provides a Gradio web interface, so users without technical expertise can run the image generation from a browser.

---

## PuLID for FLUX: Portrait-Based Image Generation System

This code implements the **PuLID (Pure and Lightning ID customization)** system for FLUX.1-dev. It is an image generation system that produces personalized images using an ID (identity) image as guidance, combining the FLUX diffusion model with identity preservation.

### Key Features

**1. Identity-Based Image Generation**
- Upload an ID image (portrait photo) to guide the generation process
- Control identity strength with an adjustable ID weight (0.0-3.0)
- Preserve facial features while applying various artistic styles

**2. Advanced Configuration Options**
- **Resolution Control**: Adjustable width (256-1536px) and height (256-1536px)
- **Generation Steps**: 1-20 steps, balancing quality against speed
- **Guidance Scale**: Fine-tune adherence to the prompt (1.0-10.0)
- **Seed Control**: Reproducible results with manual seed input

**3. True CFG (Classifier-Free Guidance)**
- Fake CFG mode (scale=1): Fast generation with basic guidance
- True CFG mode (scale>1): Improved quality, with negative prompt support
- Configurable point at which CFG is activated

**4. Technical Architecture**
- Based on the FLUX.1-dev diffusion model
- T5 text encoder for prompt understanding
- CLIP model working alongside T5 for text conditioning
- Autoencoder for latent space operations
- GPU acceleration with CUDA support

### How It Works

1. **Text Prompt Input**: Describe the desired image style (e.g., "portrait, pixar")
2. **ID Image Upload**: Provide a reference portrait for identity guidance
3. **Parameter Tuning**: Adjust generation settings for optimal results
4. **Image Generation**: The model generates an image that matches the prompt while preserving the identity

### Example Use Cases
- Convert portraits into various artistic styles (ice sculpture, Pixar animation)
- Create personalized avatars that keep the facial identity
- Generate creative variations of a person with text prompts
- Produce consistent character designs across different scenarios

The system provides an intuitive Gradio web interface, so users without technical expertise can easily use the image generation features.