Feb 10, 2025 • 6 min read • The Evolution of Artificial Intelligence: Current Trends and Future Implications • Weekly news about AI
Feb 10, 2025 • 7 min read • Securing Large Language Models: Strategies and Challenges Explained • Weekly news about LLM security
Feb 7, 2025 • 1 min read • Leveraging Reasoning with Guidelines to Elicit and Utilize Knowledge for Enhancing Safety Alignment • arXiv papers
Feb 7, 2025 • 1 min read • "Short-length" Adversarial Training Helps LLMs Defend "Long-length" Jailbreak Attacks: Theoretical and Empirical Evidence • arXiv papers
Feb 7, 2025 • 1 min read • Speak Easy: Eliciting Harmful Jailbreaks from LLMs with Simple Interactions • arXiv papers
Feb 6, 2025 • 1 min read • Understanding and Enhancing the Transferability of Jailbreaking Attacks • arXiv papers
Feb 6, 2025 • 1 min read • FACTER: Fairness-Aware Conformal Thresholding and Prompt Engineering for Enabling Fair LLM-Based Recommender Systems • arXiv papers
Feb 5, 2025 • 1 min read • PANDAS: Improving Many-shot Jailbreaking via Positive Affirmation, Negative Demonstration, and Adaptive Sampling • arXiv papers
Feb 3, 2025 • 5 min read • The Evolution of AI: Advancements, Ethics, and Implications • Weekly news about AI
Feb 3, 2025 • 3 min read • Fortifying Large Language Model Security: Risks, Vulnerabilities, and Defenses • Weekly news about LLM security
Jan 31, 2025 • 1 min read • Jailbreaking LLMs' Safeguard with Universal Magic Words for Text Embedding Models • arXiv papers