arxiv papers 1 min read

Exploiting Jailbreaking Vulnerabilities in Generative AI to Bypass Ethical Safeguards for Facilitating Phishing Attacks

Link: http://arxiv.org/abs/2507.12185v1

PDF Link: http://arxiv.org/pdf/2507.12185v1

Summary: The advent of advanced Generative AI (GenAI) models such as DeepSeek and ChatGPT has significantly reshaped the cybersecurity landscape, introducing both promising opportunities and critical risks.

This study investigates how GenAI-powered chatbot services can be exploited via jailbreaking techniques to bypass ethical safeguards, enabling the generation of phishing content, recommendation of hacking tools, and orchestration of phishing campaigns.

In ethically controlled experiments, we used ChatGPT 4o Mini, selected for its accessibility and status as the latest publicly available model at the time of experimentation, as a representative GenAI system.

Our findings reveal that the model could successfully guide novice users in executing phishing attacks across various vectors, including web, email, SMS (smishing), and voice (vishing).

Unlike automated phishing campaigns that typically follow detectable patterns, these human-guided, AI-assisted attacks are capable of evading traditional anti-phishing mechanisms, thereby posing a growing security threat.

We focused on DeepSeek and ChatGPT due to their widespread adoption and technical relevance in 2025.

The study further examines common jailbreaking techniques and the specific vulnerabilities exploited in these models.

Finally, we evaluate a range of mitigation strategies, such as user education, advanced authentication mechanisms, and regulatory policy measures, and discuss emerging trends in GenAI-facilitated phishing, outlining future research directions to strengthen cybersecurity defenses in the age of artificial intelligence.

Published on arXiv on: 2025-07-16T12:32:46Z