RAG-Specific Attack Vectors
Retrieval augmentation introduces attack surfaces that don’t exist in standalone LLMs:
Knowledge Database Poisoning: Injecting malicious documents into the corpus that trigger predetermined outputs when retrieved. The attack exploits the system’s trust in retrieved content.
Retrieval Hijacking: Manipulating ranking algorithms to prioritize malicious content during retrieval. Works by exploiting how embeddings cluster semantically.
Phantom Attacks: Inserting trigger-activated documents that appear benign until specific queries activate them.
Jamming Attacks: Flooding the corpus with “blocker” documents that force the system to refuse legitimate queries.
The common thread: RAG systems inherit not just the knowledge in their corpus, but also its vulnerabilities. Every document in the retrieval index is a potential attack surface.
Current research indicates defense mechanisms remain insufficient against sophisticated attacks. The retrieval-generation interplay creates compound vulnerabilities that neither component would exhibit alone.
Related:, 04-atom—provenance-design