Link: http://arxiv.org/abs/2505.15420v1
PDF Link: http://arxiv.org/pdf/2505.15420v1
Summary: Retrieval-Augmented Generation (RAG) systems enhance large language models (LLMs) by incorporating external knowledge bases, but they are vulnerable to privacy risks from data extraction attacks.
Existing extraction methods typically rely on malicious inputs such as prompt injection or jailbreaking, making them easily detectable via input- or output-level detection.
In this paper, we introduce the Implicit Knowledge Extraction Attack (IKEA), which conducts knowledge extraction on RAG systems through benign queries.
IKEA first leverages anchor concepts to generate natural-looking queries, and then designs two mechanisms that guide anchor concepts to thoroughly 'explore' the RAG system's private knowledge: (1) Experience Reflection Sampling, which samples anchor concepts based on past query-response patterns to ensure the queries' relevance to RAG documents; (2) Trust Region Directed Mutation, which iteratively mutates anchor concepts under similarity constraints to further exploit the embedding space.
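The two mechanisms above can be sketched roughly as follows. This is an illustrative reconstruction from the abstract alone, not the authors' implementation: the Laplace smoothing, the cosine-similarity trust-region bounds, and all function names are assumptions.

```python
import random
import numpy as np

def experience_reflection_sample(concepts, hits, misses):
    """Sample an anchor concept weighted by its past retrieval success rate.
    `hits`/`misses` count how often queries on each concept did or did not
    surface RAG documents (Laplace-smoothed; smoothing is an assumption)."""
    weights = [(hits[c] + 1) / (hits[c] + misses[c] + 2) for c in concepts]
    return random.choices(concepts, weights=weights, k=1)[0]

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def trust_region_mutate(anchor_vec, candidate_vecs, lo=0.6, hi=0.9):
    """Keep candidate mutations whose embedding similarity to the anchor
    lies in [lo, hi]: close enough to stay on-topic, far enough to explore
    a new region of the embedding space (the bounds are hypothetical)."""
    return [i for i, v in enumerate(candidate_vecs)
            if lo <= cosine(anchor_vec, v) <= hi]

# Toy 2-D embeddings (hypothetical) for illustration
anchor = np.array([1.0, 0.0])
cands = [np.array([1.0, 0.0]),   # identical -> too similar, rejected
         np.array([1.0, 0.8]),   # moderately similar -> accepted
         np.array([0.0, 1.0])]   # orthogonal -> too dissimilar, rejected
print(trust_region_mutate(anchor, cands))  # -> [1]
```

In the actual attack the mutated concepts would be re-embedded by the RAG system's retriever; the toy vectors here only demonstrate the trust-region filter.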
Extensive experiments demonstrate IKEA's effectiveness under various defenses, surpassing baselines by over 80% in extraction efficiency and 90% in attack success rate.
Moreover, the substitute RAG system built from IKEA's extractions consistently outperforms those based on baseline methods across multiple evaluation tasks, underscoring the significant privacy risk in RAG systems.
Published on arXiv on: 2025-05-21T12:04:42Z