
SafeProtein: Red-Teaming Framework and Benchmark for Protein Foundation Models

Link: http://arxiv.org/abs/2509.03487v1

PDF Link: http://arxiv.org/pdf/2509.03487v1

Summary: Proteins play crucial roles in almost all biological processes.

The advancement of deep learning has greatly accelerated the development of protein foundation models, leading to significant successes in protein understanding and design.

However, the lack of systematic red-teaming for these models has raised serious concerns about their potential misuse, such as generating proteins with biological safety risks.

This paper introduces SafeProtein, which is, to the best of our knowledge, the first red-teaming framework designed for protein foundation models.

SafeProtein combines multimodal prompt engineering and heuristic beam search to systematically design red-teaming methods and conduct tests on protein foundation models.
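For readers unfamiliar with the term, the following is a minimal, generic sketch of beam search over strings. It is not SafeProtein's implementation: the summary does not describe how the search is configured, so the function name `beam_search`, the two-letter alphabet, the reference string, and the toy scoring heuristic are all assumptions made purely for illustration.

```python
from typing import Callable, List, Tuple


def beam_search(
    start: str,
    alphabet: str,
    score: Callable[[str], float],
    length: int,
    beam_width: int = 3,
) -> Tuple[str, float]:
    """Generic beam search: grow candidates one symbol at a time and keep
    only the `beam_width` highest-scoring partial strings at each step."""
    beam: List[str] = [start]
    while len(beam[0]) < length:
        # Expand every candidate in the beam by every symbol in the alphabet.
        expanded = [cand + sym for cand in beam for sym in alphabet]
        # Rank by the heuristic score and prune to the beam width.
        expanded.sort(key=score, reverse=True)
        beam = expanded[:beam_width]
    best = beam[0]
    return best, score(best)


# Hypothetical toy heuristic: count positions that agree with a reference string.
REFERENCE = "ABBA"


def toy_score(candidate: str) -> float:
    return float(sum(a == b for a, b in zip(candidate, REFERENCE)))


best, best_score = beam_search("", "AB", toy_score, length=len(REFERENCE))
print(best, best_score)
```

The pruning step is what distinguishes beam search from exhaustive search: only a fixed number of candidates survive each round, so the cost grows linearly with the target length rather than exponentially.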

We also curated SafeProtein-Bench, which includes a manually constructed red-teaming benchmark dataset and a comprehensive evaluation protocol.

SafeProtein achieved continuous jailbreaks on state-of-the-art protein foundation models (up to 70% attack success rate for ESM3), revealing potential biological safety risks in current protein foundation models and providing insights for the development of robust security protection technologies for frontier models.
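As a rough illustration of the metric quoted above, attack success rate is commonly computed as the fraction of attempts judged successful. The sketch below assumes that definition; the summary does not state the paper's exact success criterion, and the function name `attack_success_rate` and the outcome list are made up for the example rather than taken from SafeProtein-Bench.

```python
from typing import Sequence


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of attempts flagged as successful (assumed definition)."""
    if not outcomes:
        raise ValueError("at least one outcome is required")
    return sum(outcomes) / len(outcomes)


# Made-up outcomes for illustration only: 7 successes out of 10 attempts.
example_outcomes = [True] * 7 + [False] * 3
asr = attack_success_rate(example_outcomes)
print(f"{asr:.0%}")
```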

The code will be made publicly available at https://github.com/jigang-fan/SafeProtein.

Published on arXiv on: 2025-09-03T17:13:56Z