Image-Text-to-Text
PEFT
Safetensors
File size: 1,932 Bytes
b99749a
c848260
 
 
f96ccaf
 
b99749a
 
99bb77f
 
c848260
d3e10b0
c3f2811
d3e10b0
 
bf5e04b
c848260
 
 
 
 
 
 
 
 
99bb77f
 
ea78f78
 
 
95a6e79
 
c848260
 
 
 
 
 
 
 
 
 
ad38b90
afffd3a
ad38b90
 
 
916f0f9
c848260
 
 
 
 
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
---
license: other
license_name: sample-code-license
license_link: LICENSE
library_name: peft
pipeline_tag: image-text-to-text
---

# ViPer: Visual Personalization of Generative Models via Individual Preference Learning

*Tuning-free framework for personalized image generation*

  [`Website`](https://viper.epfl.ch) | [`Paper`](https://arxiv.org/abs/2407.17365) | [`GitHub`](https://github.com/EPFL-VILAB/ViPer) | [`BibTeX`](#citation)  


We introduce **ViPer**, a method that personalizes the output of generative models to align with different users’ visual preferences for the same prompt. This is done via a one-time capture of the user’s general preferences and conditioning the generative model on them without the need for engineering detailed prompts.


## Installation
For install instructions, please see https://github.com/EPFL-VILAB/ViPer.


## Usage

This model can be loaded from Hugging Face Hub as follows:

```python
from transformers import AutoModelForVision2Seq
from peft import PeftModel

model = AutoModelForVision2Seq.from_pretrained("HuggingFaceM4/idefics2-8b")
model = PeftModel.from_pretrained(model, "EPFL-VILAB/VPE-ViPer")
```

Please see https://github.com/EPFL-VILAB/ViPer for more detailed instructions.

For more examples and interactive demos, please see our [`website`](https://viper.epfl.ch/) and [`Hugging Face Space`](https://huggingface.co/spaces/EPFL-VILAB/ViPer).

## Citation

If you find this repository helpful, please consider citing our work:
```
@article{ViPer,
  title={{ViPer}: Visual Personalization of Generative Models via Individual Preference Learning},
  author={Sogand Salehi and Mahdi Shafiei and Teresa Yeo and Roman Bachmann and Amir Zamir},
  journal={arXiv preprint arXiv:2407.17365},
  year={2024},
}
```

## License

Licensed under the Apache License, Version 2.0. See [LICENSE](https://github.com/sogandstorme/ViPer_Personalization/blob/main/LICENSE) for details.