Dec 15, 2025 • 1 min read • The Evolution of Artificial Intelligence: Advancements, Ethics, and Regulation • Summary: weekly news about AI
Dec 15, 2025 • 2 min read • Safeguarding Large Language Models: A Security Perspective • Summary: weekly news about LLM security
Dec 12, 2025 • 1 min read • How to Trick Your AI TA: A Systematic Study of Academic Jailbreaking in LLM Code Evaluation • arXiv papers
Dec 12, 2025 • 1 min read • When Reject Turns into Accept: Quantifying the Vulnerability of LLM-Based Scientific Reviewers to Indirect Prompt Injection • arXiv papers
Dec 11, 2025 • 1 min read • Black-Box Behavioral Distillation Breaks Safety Alignment in Medical LLMs • arXiv papers
Dec 11, 2025 • 1 min read • CNFinBench: A Benchmark for Safety and Compliance of Large Language Models in Finance • arXiv papers
Dec 10, 2025 • 1 min read • Universal Adversarial Suffixes Using Calibrated Gumbel-Softmax Relaxation • arXiv papers
Dec 10, 2025 • 1 min read • A Practical Framework for Evaluating Medical AI Security: Reproducible Assessment of Jailbreaking and Privacy Vulnerabilities Across Clinical Specialties • arXiv papers
Dec 9, 2025 • 1 min read • ThinkTrap: Denial-of-Service Attacks against Black-box LLM Services via Infinite Thinking • arXiv papers
Dec 9, 2025 • 1 min read • Think-Reflect-Revise: A Policy-Guided Reflective Framework for Safety Alignment in Large Vision Language Models • arXiv papers
Dec 9, 2025 • 1 min read • RL-MTJail: Reinforcement Learning for Automated Black-Box Multi-Turn Jailbreaking of Large Language Models • arXiv papers