Commit da51930 (verified), committed by frankzeng
1 Parent(s): 60e45a6

Upload folder using huggingface_hub

.DS_Store CHANGED
Binary files a/.DS_Store and b/.DS_Store differ
 
.gitattributes CHANGED
@@ -34,3 +34,6 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  assets/image_edit_demo.gif filter=lfs diff=lfs merge=lfs -text
+ assets/arch.png filter=lfs diff=lfs merge=lfs -text
+ assets/eval_res_en.png filter=lfs diff=lfs merge=lfs -text
+ assets/results_show.png filter=lfs diff=lfs merge=lfs -text
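
The three added entries route the new images through Git LFS. As a rough sketch of how such entries select files, the snippet below approximates the matching with Python's `fnmatch`. This is a simplification and not Git's actual matcher (it ignores `**` and other gitattributes rules), and the helper name `is_lfs_tracked` is hypothetical.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Patterns copied from the .gitattributes hunk above.
LFS_PATTERNS = [
    "*.zst",
    "*tfevents*",
    "assets/image_edit_demo.gif",
    "assets/arch.png",
    "assets/eval_res_en.png",
    "assets/results_show.png",
]


def is_lfs_tracked(path: str, patterns=LFS_PATTERNS) -> bool:
    """Approximate check: slash-free patterns are compared against the
    basename, patterns containing a slash against the full relative path."""
    name = PurePosixPath(path).name
    for pattern in patterns:
        target = path if "/" in pattern else name
        if fnmatchcase(target, pattern):
            return True
    return False


print(is_lfs_tracked("assets/arch.png"))  # True
print(is_lfs_tracked("README.md"))        # False
```

For an authoritative answer on a real checkout, `git check-attr filter -- <path>` reports the attribute Git itself resolves.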
README.md CHANGED
@@ -1,10 +1,15 @@
  ---
  license: mit
+ language:
+ - en
+ pipeline_tag: image-text-to-image
+ tags:
+ - multimodal
+ library_name: transformers
  ---

  ## 🔥🔥🔥 News!!
- * Apr 25, 2025: 👋 We release the evaluation code and benchmark data of Step1X-Edit. [Download GEdit-Bench](https://huggingface.co/datasets/stepfun-ai/GEdit-Bench)
- * Apr 25, 2025: 👋 We release the inference code and model weights of Step1X-Edit. [Download Step1X-Edit model](https://huggingface.co/stepfun-ai/Step1X-Edit)
+ * Apr 25, 2025: 👋 We release the inference code and model weights of Step1X-Edit. [inference code](https://github.com/stepfun-ai/Step1X-Edit)
  * Apr 25, 2025: 🎉 We have made our technical report available as open source. [Read](https://arxiv.org/abs/2504.17761)

  ## Image Edit Demos
@@ -15,6 +20,23 @@ license: mit
  </div>


+ ## Model introduction
+ <div align="center">
+ <img width="720" alt="demo" src="assets/arch.png">
+ </div>
+
+ Framework of Step1X-Edit. Step1X-Edit leverages the image understanding capabilities
+ of MLLMs to parse editing instructions and generate editing tokens, which are then decoded into
+ images using a DiT-based network. For more details, please refer to our [technical report](https://arxiv.org/abs/2504.17761).
+
+
+ ## Benchmark
+ We release [GEdit-Bench](https://huggingface.co/datasets/stepfun-ai/GEdit-Bench) as a new benchmark grounded in real-world usage. Carefully curated to reflect actual user editing needs and a wide range of editing scenarios, it enables more authentic and comprehensive evaluation of image editing models.
+ The evaluation process and related code can be found in [GEdit-Bench/EVAL.md](GEdit-Bench/EVAL.md). Partial benchmark results are shown below:
+ <div align="center">
+ <img width="1080" alt="results" src="assets/eval_res_en.png">
+ </div>
+
  ## Citation
  ```
  @article{liu2025step1x-edit,
assets/arch.png ADDED

Git LFS Details

  • SHA256: e350dd53520acd47e7e615cc624aa8a3268dd8a3f0ba404716b75a6cf5cda16b
  • Pointer size: 131 Bytes
  • Size of remote file: 116 kB
assets/eval_res_en.png ADDED

Git LFS Details

  • SHA256: 12c32cca986228634c543ac6a46e46f83bbd82e826bcfb8d82a5a41276fa1f7d
  • Pointer size: 131 Bytes
  • Size of remote file: 524 kB
assets/results_show.png ADDED

Git LFS Details

  • SHA256: 8ac57118e59a67a60572ad9fce704bc81e2c3378bba47febed0936582e4eb76a
  • Pointer size: 132 Bytes
  • Size of remote file: 2.48 MB
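
The pointer sizes listed above follow from the Git LFS pointer layout: a `version` line, an `oid sha256:` line carrying the 64-character hash, and a `size` line. The sketch below rebuilds such a pointer and checks its length. The payloads are placeholders, since the actual image bytes are not part of this page, and `lfs_pointer` is a hypothetical helper rather than part of any library.

```python
import hashlib


def lfs_pointer(data: bytes) -> str:
    """Build the text of a Git LFS pointer file for the given content."""
    oid = hashlib.sha256(data).hexdigest()
    return (
        "version https://git-lfs.github.com/spec/v1\n"
        f"oid sha256:{oid}\n"
        f"size {len(data)}\n"
    )


# Placeholder payloads whose byte counts have the same number of digits
# as the listed files (assumed sizes, not the real image contents).
small = lfs_pointer(b"\0" * 116_000)    # six-digit size, like the ~116 kB file
large = lfs_pointer(b"\0" * 2_480_000)  # seven-digit size, like the ~2.48 MB file

print(len(small.encode()))  # 131
print(len(large.encode()))  # 132
```

The one-byte difference for `assets/results_show.png` comes from its size needing seven digits instead of six.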