Apr 30, 2025 • 1 min read • Inception: Jailbreak the Memory Mechanism of Text-to-Image Generation Systems • arxiv papers
Apr 30, 2025 • 1 min read • AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security • arxiv papers
Apr 29, 2025 • 1 min read • JailbreaksOverTime: Detecting Jailbreak Attacks Under Distribution Shift • arxiv papers
Apr 24, 2025 • 1 min read • Amplified Vulnerabilities: Structured Jailbreak Attacks on LLM-based Multi-Agent Debate • arxiv papers
Apr 23, 2025 • 1 min read • T2VShield: Model-Agnostic Jailbreak Defense for Text-to-Video Models • arxiv papers
Apr 22, 2025 • 1 min read • RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search • arxiv papers
Apr 22, 2025 • 1 min read • MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning • arxiv papers
Apr 18, 2025 • 1 min read • ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition • arxiv papers
Apr 18, 2025 • 1 min read • GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms • arxiv papers
Apr 16, 2025 • 1 min read • Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models • arxiv papers
Apr 16, 2025 • 1 min read • Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails • arxiv papers