Sharma 2025 — RAG Survey

Full Title: Retrieval-Augmented Generation: A Comprehensive Survey of Architectures, Enhancements, and Robustness Frontiers

Author: Chaitanya Sharma (Independent Researcher)

Published: May 2025 (preprint under review at ACM TOIS)

Link: https://arxiv.org/abs/2506.00054

Framing

The paper positions RAG as a paradigm shift from purely parametric LLM knowledge to hybrid architectures that combine static model weights with dynamic document retrieval. The key framing tension: while RAG addresses critical limitations of parametric storage (hallucinations, staleness, domain inflexibility), it introduces entirely new failure modes in retrieval quality, grounding fidelity, pipeline efficiency, and robustness against noisy or adversarial inputs.
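
To make the parametric vs. retrieved split concrete, here is a minimal sketch of a retrieve-then-generate pipeline. It is not taken from the paper: the names (tokenize, retrieve, build_prompt, CORPUS) are hypothetical, the retriever is a toy bag-of-words cosine ranker, and the generator is left out, so only the prompt that a model would receive is built.

```python
import math
import re
from collections import Counter

# Toy corpus standing in for the dynamic document store (assumption).
CORPUS = [
    "The Eiffel Tower is in Paris.",
    "Python is a programming language.",
    "Retrieval supplies documents at inference time.",
]

def tokenize(text):
    """Lowercase and split on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, corpus, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]

def build_prompt(query, passages):
    """Combine retrieved passages with the question for a generator."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

passages = retrieve("Where is the Eiffel Tower?", CORPUS, k=1)
prompt = build_prompt("Where is the Eiffel Tower?", passages)
```

Each failure mode named above maps onto a step in this sketch: a poor ranking in retrieve is a retrieval quality problem, and a generator that ignores the Context block is a grounding fidelity problem.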

This is fundamentally a systems architecture paper, not just an AI paper. The tradeoffs it highlights (modularity vs. coordination, efficiency vs. faithfulness) apply broadly to the design of complex systems.

Core Contribution

A taxonomy organizing RAG architectures by where innovation occurs:

  1. Retriever-centric: query-driven, retriever adaptation, granularity-aware
  2. Generator-centric: faithfulness-aware decoding, context compression, retrieval-guided generation
  3. Hybrid: iterative retrieval, utility-driven optimization, dynamic triggering
  4. Robustness-oriented: noise-adaptive, hallucination-aware, adversarial defenses
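
For tagging papers or system components against this taxonomy, the four categories can be written down as a small lookup table. This is only a sketch: the category and technique labels are copied from the list above, while RAG_TAXONOMY and category_of are hypothetical names introduced here.

```python
# Taxonomy from the list above; keys and values are lowercase for matching.
RAG_TAXONOMY = {
    "retriever-centric": [
        "query-driven",
        "retriever adaptation",
        "granularity-aware",
    ],
    "generator-centric": [
        "faithfulness-aware decoding",
        "context compression",
        "retrieval-guided generation",
    ],
    "hybrid": [
        "iterative retrieval",
        "utility-driven optimization",
        "dynamic triggering",
    ],
    "robustness-oriented": [
        "noise-adaptive",
        "hallucination-aware",
        "adversarial defenses",
    ],
}

def category_of(technique):
    """Return the taxonomy category for a technique label, or None."""
    needle = technique.strip().lower()
    for category, techniques in RAG_TAXONOMY.items():
        if needle in techniques:
            return category
    return None
```

A call such as category_of("Dynamic Triggering") looks the label up case-insensitively; labels outside the taxonomy return None rather than raising.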

Key Insights Extracted

Recurring Tradeoffs

The survey identifies three persistent tensions in RAG system design:

  1. Retrieval precision vs. generation flexibility
  2. Efficiency vs. faithfulness
  3. Modularity vs. coordination
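
The second tension can be illustrated with a toy context budget. This is an assumption-laden sketch, not a method from the survey: pack_context is a hypothetical helper, and tokens are approximated by whitespace-separated words. A smaller budget means a cheaper prompt, but lower-ranked passages that might carry supporting evidence are dropped.

```python
def pack_context(ranked_passages, token_budget):
    """Greedily keep top-ranked passages until the budget would be exceeded.

    Returns the kept passages and the number of tokens they use.
    """
    kept, used = [], 0
    for passage in ranked_passages:
        cost = len(passage.split())  # crude token count (assumption)
        if used + cost > token_budget:
            break  # stop at the first passage that does not fit
        kept.append(passage)
        used += cost
    return kept, used

ranked = ["a b c", "d e", "f g h i"]
tight, tight_cost = pack_context(ranked, token_budget=4)
loose, loose_cost = pack_context(ranked, token_budget=100)
```

With the tight budget only the top passage survives; with the loose one all three are kept at roughly three times the prompt cost.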

Strategic Value

This survey is useful for:

  • Understanding the design space of knowledge-augmented AI systems
  • Identifying failure modes before they appear in production
  • Recognizing that retrieval is not a solved problem but an active area of architectural innovation