Jun 4, 2025 • 1 min read • Comparative Analysis of AI Agent Architectures for Entity Relationship Classification • arxiv papers
Jun 4, 2025 • 1 min read • It's the Thought that Counts: Evaluating the Attempts of Frontier LLMs to Persuade on Harmful Topics • arxiv papers
Jun 4, 2025 • 1 min read • BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage • arxiv papers
Jun 2, 2025 • 5 min read • The Evolution of Artificial Intelligence: Advancements, Challenges, and Ethical Considerations • weekly news about ai
Jun 2, 2025 • 2 min read • Increasing LLM Security Measures: Latest Developments and Recommendations • weekly news about llm security
May 30, 2025 • 1 min read • Fooling the Watchers: Breaking AIGC Detectors via Semantic Prompt Attacks • arxiv papers
May 30, 2025 • 1 min read • Adaptive Jailbreaking Strategies Based on the Semantic Understanding Capabilities of Large Language Models • arxiv papers
May 30, 2025 • 1 min read • Understanding Refusal in Language Models with Sparse Autoencoders • arxiv papers
May 29, 2025 • 1 min read • Adaptive Detoxification: Safeguarding General Capabilities of LLMs through Toxicity-Aware Knowledge Editing • arxiv papers
May 29, 2025 • 1 min read • Test-Time Immunization: A Universal Defense Framework Against Jailbreaks for (Multimodal) Large Language Models • arxiv papers
May 29, 2025 • 1 min read • Seeing the Threat: Vulnerabilities in Vision-Language Models to Adversarial Attack • arxiv papers