Nov 21, 2025 • An Image Is Worth Ten Thousand Words: Verbose-Text Induction Attacks on VLMs
Nov 21, 2025 • When Alignment Fails: Multimodal Adversarial Attacks on Vision-Language-Action Models
Nov 21, 2025 • PSM: Prompt Sensitivity Minimization via LLM-Guided Black-Box Optimization
Nov 21, 2025 • Q-MLLM: Vector Quantization for Robust Multimodal Large Language Model Security
Nov 21, 2025 • "To Survive, I Must Defect": Jailbreaking LLMs via the Game-Theory Scenarios
Nov 21, 2025 • The Shawshank Redemption of Embodied AI: Understanding and Benchmarking Indirect Environmental Jailbreaks
Nov 20, 2025 • Effective Code Membership Inference for Code Completion Models via Adversarial Prompts
Nov 20, 2025 • Adversarial Poetry as a Universal Single-Turn Jailbreak Mechanism in Large Language Models
Nov 18, 2025 • T2I-Based Physical-World Appearance Attack against Traffic Sign Recognition Systems in Autonomous Driving
Nov 18, 2025 • VEIL: Jailbreaking Text-to-Video Models via Visual Exploitation from Implicit Language
Nov 18, 2025 • Whistledown: Combining User-Level Privacy with Conversational Coherence in LLMs
Nov 18, 2025 • ForgeDAN: An Evolutionary Framework for Jailbreaking Aligned Large Language Models