May 27, 2025 • 1 min read • VisCRA: A Visual Chain Reasoning Attack for Jailbreaking Multimodal Large Language Models • arxiv papers
May 27, 2025 • 1 min read • JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models • arxiv papers
May 23, 2025 • 1 min read • Invisible Prompts, Visible Threats: Malicious Font Injection in External Resources for Large Language Models • arxiv papers
May 23, 2025 • 1 min read • When Safety Detectors Aren't Enough: A Stealthy and Effective Jailbreak Attack on LLMs via Steganographic Techniques • arxiv papers
May 23, 2025 • 1 min read • Implicit Jailbreak Attacks via Cross-Modal Information Concealment on Vision-Language Models • arxiv papers
May 23, 2025 • 1 min read • Three Minds, One Legend: Jailbreak Large Reasoning Model with Adaptive Stacked Ciphers • arxiv papers
May 23, 2025 • 1 min read • MixAT: Combining Continuous and Discrete Adversarial Training for LLMs • arxiv papers
May 22, 2025 • 1 min read • Silent Leaks: Implicit Knowledge Extraction Attack on RAG Systems through Benign Queries • arxiv papers
May 22, 2025 • 1 min read • Scalable Defense against In-the-wild Jailbreaking Attacks with Safety Context Retrieval • arxiv papers