
Foundation Model Self-Play: Open-Ended Strategy Innovation via Foundation Models

Link: http://arxiv.org/abs/2507.06466v1

PDF Link: http://arxiv.org/pdf/2507.06466v1

Summary: Multi-agent interactions have long fueled innovation, from natural predator-prey dynamics to the space race.

Self-play (SP) algorithms try to harness these dynamics by pitting agents against ever-improving opponents, thereby creating an implicit curriculum toward learning high-quality solutions.

However, SP often fails to produce diverse solutions and can get stuck in locally optimal behaviors.

We introduce Foundation-Model Self-Play (FMSP), a new direction that leverages the code-generation capabilities and vast knowledge of foundation models (FMs) to overcome these challenges by leaping across local optima in policy space.

We propose a family of approaches: (1) Vanilla Foundation-Model Self-Play (vFMSP) continually refines agent policies via competitive self-play; (2) Novelty-Search Self-Play (NSSP) builds a diverse population of strategies, ignoring performance; and (3) the most promising variant, Quality-Diversity Self-Play (QDSP), creates a diverse set of high-quality policies by combining the diversity of NSSP and the refinement of vFMSP.
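The difference between the three variants can be illustrated with a toy sketch of how each might decide whether to keep a newly proposed policy. This is not the paper's implementation and rests on several assumptions: a policy is reduced to a single number, the hypothetical `propose` function stands in for a foundation-model call that writes new policy code, the hypothetical `quality` function replaces the outcome of matches against opponents with a fixed score, and novelty is a simple distance threshold. The QDSP rule shown (add if novel, otherwise replace the nearest neighbor if better) is one plausible reading of "combining the diversity of NSSP and the refinement of vFMSP".

```python
import random
from typing import List

Policy = float  # assumption: a policy is a single number, purely for illustration


def propose(parent: Policy, rng: random.Random) -> Policy:
    """Hypothetical stand-in for a foundation model proposing a new policy."""
    return parent + rng.uniform(-1.0, 1.0)


def quality(policy: Policy) -> float:
    """Hypothetical fixed score (peak at 3.0) replacing real self-play matches."""
    return -abs(policy - 3.0)


def is_novel(policy: Policy, archive: List[Policy], min_dist: float = 0.5) -> bool:
    """A candidate is novel if it is at least min_dist from every archived policy."""
    return all(abs(policy - other) >= min_dist for other in archive)


def update_archive(archive: List[Policy], candidate: Policy, mode: str) -> List[Policy]:
    """Apply one variant's acceptance rule to the archive, in place."""
    if mode == "vFMSP":
        # Refinement only: keep a single policy, replace it when improved.
        if quality(candidate) > quality(archive[0]):
            archive[0] = candidate
    elif mode == "NSSP":
        # Diversity only: keep anything sufficiently different, ignore quality.
        if is_novel(candidate, archive):
            archive.append(candidate)
    elif mode == "QDSP":
        # Both: add novel candidates, otherwise compete with the nearest neighbor.
        if is_novel(candidate, archive):
            archive.append(candidate)
        else:
            nearest = min(range(len(archive)), key=lambda j: abs(archive[j] - candidate))
            if quality(candidate) > quality(archive[nearest]):
                archive[nearest] = candidate
    else:
        raise ValueError(f"unknown mode: {mode}")
    return archive


def run(mode: str, steps: int = 200, seed: int = 0) -> List[Policy]:
    """Repeatedly pick a parent, propose a variation, and apply the chosen rule."""
    rng = random.Random(seed)
    archive: List[Policy] = [0.0]
    for _ in range(steps):
        parent = rng.choice(archive)
        update_archive(archive, propose(parent, rng), mode)
    return archive
```

Calling `run("vFMSP")`, `run("NSSP")`, and `run("QDSP")` returns the final archive for each rule. In this toy setting the vFMSP archive always holds exactly one policy, while the other two grow as novel candidates appear.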

We evaluate FMSPs in Car Tag, a continuous-control pursuer-evader setting, and in Gandalf, a simple AI safety simulation in which an attacker tries to jailbreak an LLM's defenses.

In Car Tag, FMSPs explore a wide variety of reinforcement learning, tree search, and heuristic-based methods, to name just a few.

In terms of discovered policy quality, QDSP and vFMSP surpass strong human-designed strategies.

In Gandalf, FMSPs can successfully automatically red-team an LLM, breaking through and jailbreaking six different, progressively stronger levels of defense.

Furthermore, FMSPs can automatically proceed to patch the discovered vulnerabilities.

Overall, FMSPs represent a promising new research frontier of improving self-play with foundation models, opening fresh paths toward more creative and open-ended strategy discovery.

Published on arXiv on: 2025-07-09T00:58:19Z