Mar 19, 2025 • 1 min read • Make the Most of Everything: Further Considerations on Disrupting Diffusion-based Customization • arxiv papers
Mar 18, 2025 • 1 min read • Evolution-based Region Adversarial Prompt Learning for Robustness Enhancement in Vision-Language Models • arxiv papers
Mar 18, 2025 • 1 min read • MirrorGuard: Adaptive Defense Against Jailbreaks via Entropy-Guided Mirror Crafting • arxiv papers
Mar 17, 2025 • 3 min read • The Evolution of AI: Global Trends and Competition Dynamics • weekly news about ai
Mar 17, 2025 • 3 min read • Securing Large Language Models: Addressing Security Challenges and Solutions • weekly news about llm security
Mar 14, 2025 • 1 min read • Siege: Autonomous Multi-Turn Jailbreaking of Large Language Models with Tree Search • arxiv papers
Mar 13, 2025 • 1 min read • JBFuzz: Jailbreaking LLMs Efficiently and Effectively Using Fuzzing • arxiv papers
Mar 13, 2025 • 1 min read • Probing Latent Subspaces in LLM for AI Security: Identifying and Manipulating Adversarial States • arxiv papers
Mar 13, 2025 • 1 min read • Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models • arxiv papers
Mar 12, 2025 • 1 min read • Dialogue Injection Attack: Jailbreaking LLMs through Context Manipulation • arxiv papers
Mar 11, 2025 • 1 min read • Utilizing Jailbreak Probability to Attack and Safeguard Multimodal LLMs • arxiv papers
Mar 11, 2025 • 1 min read • TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models • arxiv papers