RAG security testing checklist for enterprise AI products

How to test retrieval-augmented generation systems for tenant leakage, prompt injection, poisoning, and sensitive data exposure.

RAG securityAI pentestprompt injectiondata leakage

Retrieval-augmented generation changes the security problem. The model is no longer answering only from a prompt and its training data. It is answering from retrieved documents, embeddings, metadata, user context, permissions, and tool outputs. That makes RAG security a data access problem as much as an AI problem.

The first test area is retrieval authorization. If a user cannot access a document in the product, they should not be able to retrieve it through semantic search, summarization, citations, cached chunks, autocomplete, or follow-up questions. Test across tenants, roles, teams, archived records, deleted documents, and recently changed permissions.

The second area is indirect prompt injection. A malicious instruction hidden inside a document, ticket, webpage, email, or knowledge base article can attempt to override the system prompt when retrieved. Good testing checks whether the model follows untrusted document instructions, leaks data, calls tools, or changes behavior after retrieval.

The third area is poisoning. Can an attacker get content into the index that influences future outputs? Can low-trust sources outrank canonical documents? Can metadata, filenames, chunk order, or repeated phrases manipulate retrieval? Poisoning tests should include both content and ranking abuse.

The fourth area is sensitive output handling. Even if retrieval is authorized, the system may summarize secrets, credentials, PHI, customer data, or internal notes into contexts where they do not belong. Test redaction, source citations, export flows, conversation sharing, logs, and analytics events.

Finally, test tool interaction. RAG often feeds agents that can create tickets, send emails, update CRM records, query databases, or trigger workflows. If retrieved text can influence tool calls, the impact moves from information disclosure to action abuse.

DeepScan AI testing records multi-turn transcripts, retrieved context, tool calls, and observed impact. That evidence is what lets AI engineers fix the actual failure rather than debate whether a prompt was merely weird.