May 27, 2025 • 1 min read • What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs • arxiv papers
May 27, 2025 • 1 min read • Attention! You Vision Language Model Could Be Maliciously Manipulated • arxiv papers
May 23, 2025 • 1 min read • Three Minds, One Legend: Jailbreak Large Reasoning Model with Adaptive Stacked Ciphers • arxiv papers
May 23, 2025 • 1 min read • Implicit Jailbreak Attacks via Cross-Modal Information Concealment on Vision-Language Models • arxiv papers
May 23, 2025 • 1 min read • When Safety Detectors Aren't Enough: A Stealthy and Effective Jailbreak Attack on LLMs via Steganographic Techniques • arxiv papers
May 23, 2025 • 1 min read • Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models • arxiv papers
May 23, 2025 • 1 min read • MixAT: Combining Continuous and Discrete Adversarial Training for LLMs • arxiv papers
May 22, 2025 • 1 min read • Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries • arxiv papers
May 22, 2025 • 1 min read • Audio Jailbreak: An Open Comprehensive Benchmark for Jailbreaking Large Audio-Language Models • arxiv papers