arxiv papers 1 min read

Make the Most of Everything: Further Considerations on Disrupting Diffusion-based Customization

Link: http://arxiv.org/abs/2503.13945v1

PDF Link: http://arxiv.org/pdf/2503.13945v1

Summary: Fine-tuning techniques for text-to-image diffusion models facilitate image customization but risk privacy breaches and opinion manipulation.

Current research focuses on prompt- or image-level adversarial attacks for anti-customization, yet it overlooks the correlation between these two levels and the relationship between internal modules and inputs.

This hinders anti-customization performance in practical threat scenarios.

We propose Dual Anti-Diffusion (DADiff), a two-stage adversarial attack targeting diffusion customization, which, for the first time, integrates the adversarial prompt-level attack into the generation process of image-level adversarial examples.

In stage 1, we generate prompt-level adversarial vectors to guide the subsequent image-level attack.
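In essence, a stage-1 prompt-level adversarial vector can be obtained by gradient ascent on the prompt embedding against the model's loss. A minimal NumPy sketch of that idea, where `loss_grad` is a hypothetical stand-in for the true diffusion-loss gradient with respect to the embedding (all names and hyperparameters here are illustrative, not the paper's exact procedure):

```python
import numpy as np

def adversarial_prompt_vector(embed, loss_grad, steps=10, lr=0.1):
    """Refine a prompt embedding by gradient ascent so it maximizes a given
    loss (a sketch of a stage-1 prompt-level attack; `loss_grad` stands in
    for the real diffusion-loss gradient w.r.t. the embedding)."""
    v = embed.copy()
    for _ in range(steps):
        v = v + lr * loss_grad(v)  # ascend the loss surface
    return v

# toy example: the gradient of 0.5 * ||v||^2 is v itself
toy_grad = lambda v: v
v0 = np.array([1.0, -2.0, 0.5])
v_adv = adversarial_prompt_vector(v0, toy_grad)
```

The returned vector then serves as a fixed target guiding the image-level attack in stage 2.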

In stage 2, besides conducting the end-to-end attack on the UNet model, we disrupt its self- and cross-attention modules, aiming to break the correlations between image pixels and align the cross-attention results computed using instance prompts and adversarial prompt vectors within the images.
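The attention-alignment part of stage 2 can be pictured as a loss that pushes the cross-attention map computed with the instance prompt toward the one computed with the adversarial prompt vector. A hedged NumPy sketch, where the shapes and the cosine-distance objective are illustrative assumptions rather than the paper's exact formulation:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_alignment_loss(q, k_inst, k_adv):
    """Cosine distance between the cross-attention map computed with
    instance-prompt keys and the one computed with adversarial-prompt keys
    (an illustrative stand-in for the stage-2 alignment objective)."""
    d = q.shape[-1]
    a_inst = softmax(q @ k_inst.T / np.sqrt(d))  # attention w/ instance prompt
    a_adv = softmax(q @ k_adv.T / np.sqrt(d))    # attention w/ adversarial prompt
    u, v = a_inst.ravel(), a_adv.ravel()
    # minimizing this drives the two attention maps to coincide
    return 1.0 - float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))
```

Because attention weights are nonnegative, the loss lies in [0, 1] and reaches 0 exactly when the two maps coincide.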

Furthermore, we introduce a local random timestep gradient ensemble strategy, which updates adversarial perturbations by integrating random gradients from multiple segmented timestep sets.
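Under stated assumptions, that strategy can be sketched as: split the diffusion timestep range into contiguous segments, draw one random timestep from each, average the resulting gradients, and take one signed PGD step. In this NumPy sketch, `grad_fn` is a toy stand-in for the real diffusion-loss gradient, and all hyperparameters are illustrative:

```python
import numpy as np

def random_timestep_ensemble_step(x, grad_fn, num_timesteps=1000,
                                  num_segments=5, alpha=0.01, eps=0.05,
                                  rng=None):
    """One signed-gradient (PGD-style) update using gradients averaged over
    random timesteps, one drawn from each contiguous segment of the timestep
    range (a sketch of a local random timestep gradient ensemble)."""
    rng = np.random.default_rng(rng)
    bounds = np.linspace(0, num_timesteps, num_segments + 1, dtype=int)
    grads = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        t = rng.integers(lo, hi)      # random timestep within this segment
        grads.append(grad_fn(x, t))
    g = np.mean(grads, axis=0)        # ensemble the per-segment gradients
    x_adv = x + alpha * np.sign(g)    # signed gradient ascent step
    # keep the perturbation inside the eps ball (here, around the input x)
    return np.clip(x_adv, x - eps, x + eps)

# toy gradient standing in for the diffusion training-loss gradient
toy_grad = lambda x, t: 2.0 * (t / 1000.0) * x
x0 = np.ones(4)
x1 = random_timestep_ensemble_step(x0, toy_grad, rng=0)
```

Averaging gradients across segments covers the whole timestep range each iteration while sampling only a few timesteps, trading exhaustive coverage for cost.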

Experimental results on various mainstream facial datasets demonstrate 10%-30% improvements in cross-prompt, keyword-mismatch, cross-model, and cross-mechanism anti-customization with DADiff compared to existing methods.

Published on arXiv on: 2025-03-18T06:22:03Z