Personalized Safety Alignment for Text-to-Image Diffusion Models
Abstract
A personalized safety alignment framework integrates user-specific profiles into text-to-image diffusion models to better align generated content with individual safety preferences.
Text-to-image diffusion models have revolutionized visual content generation, but current safety mechanisms apply uniform standards that often fail to account for individual user preferences. These models overlook the diverse safety boundaries shaped by factors like age, mental health, and personal beliefs. To address this, we propose Personalized Safety Alignment (PSA), a framework that allows user-specific control over safety behaviors in generative models. PSA integrates personalized user profiles into the diffusion process, adjusting the model's behavior to match individual safety preferences while preserving image quality. We introduce a new dataset, Sage, which captures user-specific safety preferences, and incorporate these profiles through a cross-attention mechanism. Experiments show that PSA outperforms existing methods in harmful content suppression and better aligns generated content with user constraints, achieving higher Win Rate and Pass Rate scores. Our code, data, and models are publicly available at https://torpedo2648.github.io/PSAlign/.
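The abstract describes injecting user-specific profiles into the diffusion process through a cross-attention mechanism. The sketch below shows one plausible way such a profile-conditioned adapter could be wired up in PyTorch; the module names (UserProfileEncoder, ProfileCrossAttentionAdapter), the profile encoding, and the gating scheme are illustrative assumptions, not the released PSAlign implementation.

```python
# Minimal sketch (PyTorch): conditioning diffusion U-Net features on a
# user-profile embedding via a gated cross-attention adapter.
# All names and sizes here are illustrative assumptions.
import torch
import torch.nn as nn


class UserProfileEncoder(nn.Module):
    """Embeds categorical profile fields (age band, gender, religion, health, ...)."""

    def __init__(self, field_cardinalities, dim=768):
        super().__init__()
        self.embeddings = nn.ModuleList(
            nn.Embedding(card, dim) for card in field_cardinalities
        )

    def forward(self, profile_ids):  # (batch, num_fields) integer ids
        tokens = [emb(profile_ids[:, i]) for i, emb in enumerate(self.embeddings)]
        return torch.stack(tokens, dim=1)  # (batch, num_fields, dim)


class ProfileCrossAttentionAdapter(nn.Module):
    """Injects profile tokens into U-Net features via cross-attention.
    The output is added residually with a zero-initialized gate, so the
    base model's behavior is unchanged at initialization."""

    def __init__(self, feature_dim=768, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(feature_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(feature_dim)
        self.gate = nn.Parameter(torch.zeros(1))  # starts as a no-op

    def forward(self, features, profile_tokens):
        # features: (batch, seq_len, feature_dim) flattened spatial features
        attn_out, _ = self.attn(self.norm(features), profile_tokens, profile_tokens)
        return features + self.gate.tanh() * attn_out


# Usage: encode a user's profile once, then apply the adapter inside the
# attention blocks of the diffusion U-Net during denoising.
encoder = UserProfileEncoder(field_cardinalities=[8, 3, 10, 5])
adapter = ProfileCrossAttentionAdapter()
profile = torch.tensor([[2, 1, 4, 0]])      # one user's profile ids (hypothetical)
features = torch.randn(1, 64, 768)          # example intermediate U-Net features
conditioned = adapter(features, encoder(profile))
```

The zero-initialized gate is a common adapter trick: it lets the profile branch be trained on top of a frozen base model without degrading image quality before the adapter has learned anything useful.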
Community
Personalized AI Safety is here!
We introduce PSA, the first user-aware safety alignment for text-to-image generation.
Today's AI models apply the same filters to everyone. But users differ: by age, beliefs, or mental health.
So we built a system that:
- Learns your safety preferences from a profile (age, gender, religion, health...)
- Guides generation using cross-attention adapters
- Suppresses harmful content only when you find it unsafe
Result? AI that's safer for you, not just in general.
It outperforms baselines on harmful content erasure and personalization.
Paper: https://arxiv.org/abs/2508.01151
Code: https://github.com/M-E-AGI-Lab/PSAlign
Project: https://m-e-agi-lab.github.io/PSAlign/
This is an automated message from the Librarian Bot. I found the following papers similar to this paper.
The following papers were recommended by the Semantic Scholar API
- PromptSafe: Gated Prompt Tuning for Safe Text-to-Image Generation (2025)
- AttriCtrl: Fine-Grained Control of Aesthetic Attribute Intensity in Diffusion Models (2025)
- Steering Guidance for Personalized Text-to-Image Diffusion Models (2025)
- A Training-Free Style-Personalization via Scale-wise Autoregressive Model (2025)
- Consistent Story Generation with Asymmetry Zigzag Sampling (2025)
- LoRAShield: Data-Free Editing Alignment for Secure Personalized LoRA Sharing (2025)
- Local Prompt Adaptation for Style-Consistent Multi-Object Generation in Diffusion Models (2025)