Feb 10, 2025 • 6 min read • The Evolution of Artificial Intelligence: Current Trends and Future Implications • Weekly news about AI
Feb 10, 2025 • 7 min read • Securing Large Language Models: Strategies and Challenges Explained • Weekly news about LLM security
Feb 7, 2025 • 1 min read • Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment • arXiv papers
Feb 7, 2025 • 1 min read • "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence • arXiv papers
Feb 7, 2025 • 1 min read • Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions • arXiv papers
Feb 6, 2025 • 1 min read • Understanding and Enhancing the Transferability of Jailbreaking Attacks • arXiv papers
Feb 6, 2025 • 1 min read • FACTER: Fairness-Aware Conformal Thresholding and Prompt Engineering for Enabling Fair LLM-Based Recommender Systems • arXiv papers
Feb 5, 2025 • 1 min read • PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling • arXiv papers
Feb 3, 2025 • 5 min read • The Evolution of AI: Advancements, Ethics, and Implications • Weekly news about AI
Feb 3, 2025 • 3 min read • Fortifying Large Language Model Security: Risks, Vulnerabilities, and Defenses • Weekly news about LLM security
Jan 31, 2025 • 1 min read • Jailbreaking LLMs' Safeguard with Universal Magic Words for Text Embedding Models • arXiv papers