Apr 22, 2025 • 1 min read RainbowPlus: Enhancing Adversarial Prompt Generation via Evolutionary Quality-Diversity Search arxiv papers
Apr 22, 2025 • 1 min read MR. Guard: Multilingual Reasoning Guardrail using Curriculum Learning arxiv papers
Apr 21, 2025 • 7 min read Exploring the Future Impact of Artificial Intelligence weekly news about ai
Apr 21, 2025 • 5 min read The Imperative of Security in Large Language Models: Exploring Trends, Vulnerabilities, and Industry Responses weekly news about llm security
Apr 18, 2025 • 1 min read ZeroSumEval: Scaling LLM Evaluation with Inter-Model Competition arxiv papers
Apr 18, 2025 • 1 min read GraphAttack: Exploiting Representational Blindspots in LLM Safety Mechanisms arxiv papers
Apr 16, 2025 • 1 min read Token-Level Constraint Boundary Search for Jailbreaking Text-to-Image Models arxiv papers
Apr 16, 2025 • 1 min read Bypassing Prompt Injection and Jailbreak Detection in LLM Guardrails arxiv papers
Apr 15, 2025 • 1 min read RealSafe-R1: Safety-Aligned DeepSeek-R1 without Compromising Reasoning Capability arxiv papers
Apr 15, 2025 • 1 min read LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks arxiv papers
Apr 14, 2025 • 8 min read Advancements in AI Technology: Google's Impact and Future Implications weekly news about ai
Apr 14, 2025 • 7 min read Securing Large Language Models: Addressing Vulnerabilities and Compliance Challenges weekly news about llm security