Link: http://arxiv.org/abs/2508.17674v1
PDF Link: http://arxiv.org/pdf/2508.17674v1
Summary: We introduce Advertisement Embedding Attacks (AEA), a new class of LLMsecurity threats that stealthily inject promotional or malicious content intomodel outputs and AI agents.
AEA operate through two low-cost vectors: (1)hijacking third-party service-distribution platforms to prepend adversarialprompts, and (2) publishing back-doored open-source checkpoints fine-tuned withattacker data.
Unlike conventional attacks that degrade accuracy, AEA subvertinformation integrity, causing models to return covert ads, propaganda, or hatespeech while appearing normal.
We detail the attack pipeline, map fivestakeholder victim groups, and present an initial prompt-based self-inspectiondefense that mitigates these injections without additional model retraining.
Our findings reveal an urgent, under-addressed gap in LLM security and call forcoordinated detection, auditing, and policy responses from the AI-safetycommunity.
Published on arXiv on: 2025-08-25T05:13:23Z