
Improve model card: Fix pipeline tag, add library name and improve content

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +65 -16
README.md CHANGED
@@ -1,26 +1,75 @@
  ---
- tags:
- - model_hub_mixin
- - pytorch_model_hub_mixin
- license: apache-2.0
  datasets:
  - timbrooks/instructpix2pix-clip-filtered
  - SherryXTChen/InstructCLIP-InstructPix2Pix-Data
  language:
  - en
- pipeline_tag: image-to-text
- base_model:
- - SherryXTChen/LatentDiffusionDINOv2
  ---

- This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration:
  The model is based on the paper [Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning](https://huggingface.co/papers/2503.18406).

- - Library:
- ```
- torch==2.4.0
- torchvision==0.19.0
- diffusers==0.30.3
- transformers==4.45.2
- ```
- - Docs: See our [repo](https://github.com/SherryXTChen/Instruct-CLIP.git) for more information.
  ---
+ base_model:
+ - SherryXTChen/LatentDiffusionDINOv2
  datasets:
  - timbrooks/instructpix2pix-clip-filtered
  - SherryXTChen/InstructCLIP-InstructPix2Pix-Data
  language:
  - en
+ license: apache-2.0
+ pipeline_tag: image-to-image
+ library_name: diffusers
+ tags:
+ - model_hub_mixin
+ - pytorch_model_hub_mixin
  ---

+ # InstructCLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning (CVPR 2025)
+
+ This model has been pushed to the Hub using the [PytorchModelHubMixin](https://huggingface.co/docs/huggingface_hub/package_reference/mixins#huggingface_hub.PyTorchModelHubMixin) integration.
  The model is based on the paper [Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning](https://huggingface.co/papers/2503.18406).

+ [Arxiv](http://arxiv.org/abs/2503.18406) | [Image Editing Model](https://huggingface.co/SherryXTChen/InstructCLIP-InstructPix2Pix) | [Data Refinement Model](https://huggingface.co/SherryXTChen/Instruct-CLIP) | [Data](https://huggingface.co/datasets/SherryXTChen/InstructCLIP-InstructPix2Pix-Data)
+
+
+ ## Capabilities
+
+ <p align="center">
+ <img src="https://github.com/SherryXTChen/Instruct-CLIP/blob/main/assets/teaser_1.png" alt="Figure 1" width="43%">
+ <img src="https://github.com/SherryXTChen/Instruct-CLIP/blob/main/assets/teaser_2.png" alt="Figure 2" width="50%">
+ </p>
+
+ ## Installation
+ ```
+ pip install -r requirements.txt
+ ```
+
+ ## Inference
+
+ ```python
+ import PIL
+ import requests
+ import torch
+ from diffusers import StableDiffusionInstructPix2PixPipeline, EulerAncestralDiscreteScheduler
+
+ model_id = "timbrooks/instruct-pix2pix"
+ pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16)
+ pipe.load_lora_weights("SherryXTChen/InstructCLIP-InstructPix2Pix")
+ pipe.to("cuda")
+ pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(pipe.scheduler.config)
+
+ url = "https://raw.githubusercontent.com/SherryXTChen/Instruct-CLIP/refs/heads/main/assets/1_input.jpg"
+ def download_image(url):
+     image = PIL.Image.open(requests.get(url, stream=True).raw)
+     image = PIL.ImageOps.exif_transpose(image)
+     image = image.convert("RGB")
+     return image
+ image = download_image(url)
+
+ prompt = "as a 3 d sculpture"
+ images = pipe(prompt, image=image, num_inference_steps=20).images
+ images[0].save("output.jpg")
+ ```
+
+ ## Citation
+ ```bibtex
+ @misc{chen2025instructclipimprovinginstructionguidedimage,
+       title={Instruct-CLIP: Improving Instruction-Guided Image Editing with Automated Data Refinement Using Contrastive Learning},
+       author={Sherry X. Chen and Misha Sra and Pradeep Sen},
+       year={2025},
+       eprint={2503.18406},
+       archivePrefix={arXiv},
+       primaryClass={cs.CV},
+       url={https://arxiv.org/abs/2503.18406},
+ }
+ ```
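
The metadata part of this change (pipeline tag corrected, library name added) can be checked mechanically. The following is a minimal sketch, not Hub tooling: `parse_front_matter` and `check_card` are hypothetical helpers written for this illustration, the parser only handles the flat scalar-and-list YAML subset that appears in this card, and the two checks simply mirror what this PR changes rather than any official validation rule.

```python
# Hypothetical helpers for illustration only; they are not part of huggingface_hub.
EXPECTED_PIPELINE_TAG = "image-to-image"  # assumption: the task this PR settles on

OLD_CARD = """
---
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
license: apache-2.0
datasets:
- timbrooks/instructpix2pix-clip-filtered
- SherryXTChen/InstructCLIP-InstructPix2Pix-Data
language:
- en
pipeline_tag: image-to-text
base_model:
- SherryXTChen/LatentDiffusionDINOv2
---
"""

NEW_CARD = """
---
base_model:
- SherryXTChen/LatentDiffusionDINOv2
datasets:
- timbrooks/instructpix2pix-clip-filtered
- SherryXTChen/InstructCLIP-InstructPix2Pix-Data
language:
- en
license: apache-2.0
pipeline_tag: image-to-image
library_name: diffusers
tags:
- model_hub_mixin
- pytorch_model_hub_mixin
---
"""


def parse_front_matter(card_text):
    """Parse the small YAML subset used in this card: scalars and flat lists."""
    lines = card_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("card does not start with a '---' front matter block")
    meta = {}
    current_list = None  # list currently being filled by '- item' lines, if any
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":  # closing delimiter: front matter is complete
            return meta
        if not stripped:
            continue
        if stripped.startswith("- "):
            if current_list is None:
                raise ValueError(f"list item without a key: {stripped!r}")
            current_list.append(stripped[2:].strip())
            continue
        key, sep, value = stripped.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {stripped!r}")
        value = value.strip()
        if value:  # scalar entry such as 'license: apache-2.0'
            meta[key.strip()] = value
            current_list = None
        else:  # key with no inline value opens a list
            current_list = meta.setdefault(key.strip(), [])
    raise ValueError("front matter block is never closed")


def check_card(meta):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    tag = meta.get("pipeline_tag")
    if tag != EXPECTED_PIPELINE_TAG:
        problems.append(f"pipeline_tag is {tag!r}, expected {EXPECTED_PIPELINE_TAG!r}")
    if "library_name" not in meta:
        problems.append("library_name is missing")
    return problems


if __name__ == "__main__":
    for label, card in (("old", OLD_CARD), ("new", NEW_CARD)):
        print(label, check_card(parse_front_matter(card)))
```

Run against the two front matter blocks from the diff, the old card is flagged for both the pipeline tag and the missing library name, while the new card passes both checks. For anything beyond this flat structure a real YAML parser would be the safer choice.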