AudioJailbreak: Jailbreak Attacks against End-to-End Large Audio-Language Models

Link: http://arxiv.org/abs/2505.14103v1

PDF Link: http://arxiv.org/pdf/2505.14103v1

Summary: Jailbreak attacks on large audio-language models (LALMs) have been studied recently, but they achieve suboptimal effectiveness, applicability, and practicability; in particular, they assume that the adversary can fully manipulate user prompts.

In this work, we first conduct an extensive experiment showing that advanced text jailbreak attacks cannot be easily ported to end-to-end LALMs via text-to-speech (TTS) techniques.

We then propose AudioJailbreak, a novel audio jailbreak attack, featuring (1) asynchrony: the jailbreak audio does not need to align with user prompts in the time axis by crafting suffixal jailbreak audios; (2) universality: a single jailbreak perturbation is effective for different prompts by incorporating multiple prompts into perturbation generation; (3) stealthiness: the malicious intent of jailbreak audios will not raise the awareness of victims by proposing various intent concealment strategies; and (4) over-the-air robustness: the jailbreak audios remain effective when being played over the air by incorporating the reverberation distortion effect with room impulse response into the generation of the perturbations.

In contrast, all prior audio jailbreak attacks cannot offer asynchrony, universality, stealthiness, or over-the-air robustness.

Moreover, AudioJailbreak is also applicable to an adversary who cannot fully manipulate user prompts, and thus covers a much broader attack scenario.

Extensive experiments on the largest number of LALMs evaluated thus far demonstrate the high effectiveness of AudioJailbreak.

We highlight that our work peeks into the security implications of audio jailbreak attacks against LALMs, and realistically fosters improving their security robustness.

The implementation and audio samples are available at our website https://audiojailbreak.github.io/AudioJailbreak.
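The summary describes three mechanical ideas that are easy to picture in code: appending the jailbreak audio as a suffix to the user prompt, averaging the objective over several prompts for universality, and convolving the suffix with a room impulse response (RIR) to model over-the-air playback. The following is a minimal NumPy sketch of those ideas only, not the authors' implementation. The function names (`apply_rir`, `append_suffix`, `universal_loss`), the `gap_samples` parameter, and the placeholder `loss_fn` are assumptions made for illustration; the paper's actual loss, optimizer, and model interface are not given in the summary.

```python
import numpy as np

def apply_rir(audio, rir):
    """Simulate reverberation by convolving audio with a room impulse
    response, truncated to the original length (illustrative choice)."""
    return np.convolve(audio, rir, mode="full")[: len(audio)]

def append_suffix(prompt_audio, suffix_audio, gap_samples=0):
    """Concatenate the user prompt, an optional silent gap, and the suffix.
    The gap stands in for the suffix not being time-aligned with the prompt."""
    silence = np.zeros(gap_samples, dtype=prompt_audio.dtype)
    return np.concatenate([prompt_audio, silence, suffix_audio])

def universal_loss(suffix, prompts, rirs, loss_fn):
    """Average a caller-supplied loss over every (prompt, RIR) pair.
    loss_fn is a hypothetical stand-in for a model-based objective."""
    losses = []
    for prompt in prompts:
        for rir in rirs:
            played = apply_rir(suffix, rir)  # suffix as heard over the air
            losses.append(loss_fn(append_suffix(prompt, played)))
    return float(np.mean(losses))
```

In a real attack, an optimizer would adjust `suffix` to minimize such an averaged objective against a target model; here `loss_fn` can be any function that maps a waveform to a number, which keeps the sketch runnable without a model.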

Published on arXiv on: 2025-05-20T09:10:45Z