
CogMorph: Cognitive Morphing Attacks for Text-to-Image Models

Link: http://arxiv.org/abs/2501.11815v1

PDF Link: http://arxiv.org/pdf/2501.11815v1

Summary: The development of text-to-image (T2I) generative models, which enable the creation of high-quality synthetic images from textual prompts, has opened new frontiers in creative design and content generation.

However, this paper reveals a significant and previously unrecognized ethical risk inherent in this technology and introduces a novel method, termed the Cognitive Morphing Attack (CogMorph), which manipulates T2I models to generate images that retain the original core subjects but embed toxic or harmful contextual elements.

This nuanced manipulation exploits the cognitive principle that human perception of concepts is shaped by the entire visual scene and its context, producing images that amplify emotional harm far beyond attacks that merely preserve the original semantics.

To address this, we first construct an imagery toxicity taxonomy spanning 10 major and 48 sub-categories, aligned with human cognitive-perceptual dimensions, and further build a toxicity risk matrix resulting in 1,176 high-quality T2I toxic prompts.

Based on this, our CogMorph first introduces Cognitive Toxicity Augmentation, which develops a cognitive toxicity knowledge base with rich external toxic representations for humans (e.g., fine-grained visual features) that can be utilized to further guide the optimization of adversarial prompts.

In addition, we present Contextual Hierarchical Morphing, which hierarchically extracts critical parts of the original prompt (e.g., scenes, subjects, and body parts), and then iteratively retrieves and fuses toxic features to inject harmful contexts.

Extensive experiments on multiple open-sourced T2I models and black-box commercial APIs (e.g., DALLE-3) demonstrate the efficacy of CogMorph, which significantly outperforms other baselines by large margins (+20.62% on average).

Published on arXiv on: 2025-01-21T01:45:56Z