RAG Architecture Taxonomy

Overview

This taxonomy categorizes retrieval-augmented generation (RAG) systems by where architectural innovation occurs. It helps map the design space and compare the tradeoffs between approaches.

Components

1. Retriever-Centric Systems

Innovation happens before generation. The retriever bears responsibility for quality.

Sub-patterns:

Query-driven: Refine queries before retrieval (decomposition, rewriting, reformulation)
Retriever-centric adaptation: Modify the retriever itself through learning or architecture changes
Granularity-aware: Optimize the unit of retrieval (documents vs. passages vs. sentences)

Tradeoff: Preserves modularity and interpretability, but introduces latency and sensitivity to ambiguous queries.
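
As an illustration of the granularity-aware sub-pattern, here is a minimal sketch that splits documents into different retrieval units and ranks them against a query. The function names (`split_units`, `overlap_score`, `retrieve`) and the token-overlap scoring are assumptions made for this example; a real system would more likely use BM25 or a dense retriever.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_units(doc, granularity):
    """Split a document into retrieval units at the chosen granularity."""
    if granularity == "document":
        return [doc.strip()]
    if granularity == "passage":
        # Assumption: passages are separated by blank lines.
        return [p.strip() for p in doc.split("\n\n") if p.strip()]
    if granularity == "sentence":
        # Naive sentence split on terminal punctuation followed by whitespace.
        return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    raise ValueError(f"unknown granularity: {granularity}")


def overlap_score(query, unit):
    """Count query tokens that also appear in the unit (a toy relevance score)."""
    q, u = Counter(tokenize(query)), Counter(tokenize(unit))
    return sum(min(q[t], u[t]) for t in q)


def retrieve(query, docs, granularity="passage", k=3):
    """Return the top-k units across all documents at the given granularity."""
    units = [u for d in docs for u in split_units(d, granularity)]
    return sorted(units, key=lambda u: overlap_score(query, u), reverse=True)[:k]


docs = ["Paris is the capital of France. It hosts the Louvre.\n\n"
        "Berlin is the capital of Germany."]
print(retrieve("capital of France", docs, granularity="sentence", k=1))
```

Smaller units tend to return more focused context but drop surrounding detail, which is the choice this sub-pattern tunes.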

2. Generator-Centric Systems

Innovation happens during generation. The generator compensates for retrieval imperfections.

Sub-patterns:

Faithfulness-aware decoding: Self-reflection, verification, or correction during generation
Context compression: Optimize retrieved content into denser representations
Retrieval-guided generation: Modulate generation based on retrieval metadata

Tradeoff: Can recover from suboptimal retrieval, but requires more sophisticated generation architectures.
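
The context compression sub-pattern can be sketched as a simple extractive filter. This is a toy version under stated assumptions: the name `compress_context`, the `max_tokens` budget, and the query-term overlap heuristic are all hypothetical, and learned or abstractive compressors are common alternatives.

```python
import re


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def compress_context(query, passages, max_tokens=50):
    """Keep the sentences most related to the query, within a token budget."""
    query_terms = set(tokenize(query))

    # Flatten all retrieved passages into sentences, remembering their order.
    sentences = []
    for passage in passages:
        for s in re.split(r"(?<=[.!?])\s+", passage):
            if s.strip():
                sentences.append(s.strip())

    # Score each sentence by how many distinct query terms it contains.
    scored = [(len(query_terms & set(tokenize(s))), i, s)
              for i, s in enumerate(sentences)]

    kept, budget = [], max_tokens
    # Visit the highest-scoring sentences first; ties keep original order.
    for score, i, s in sorted(scored, key=lambda x: (-x[0], x[1])):
        cost = len(tokenize(s))
        if score > 0 and cost <= budget:
            kept.append((i, s))
            budget -= cost

    # Restore original order so the compressed context still reads naturally.
    return " ".join(s for _, s in sorted(kept))


passages = ["The Eiffel Tower is in Paris. It was completed in 1889. "
            "Tickets can be bought online."]
print(compress_context("When was the Eiffel Tower completed?", passages, max_tokens=12))
```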

3. Hybrid Systems

Innovation spans both retriever and generator through tight coupling.

Sub-patterns:

Iterative/multi-round retrieval: Interleave retrieval and generation across reasoning steps
Utility-driven optimization: Align retriever outputs with generation objectives end-to-end
Dynamic retrieval triggering: Decide when to retrieve based on model uncertainty

Tradeoff: Most powerful, but hardest to train, debug, and deploy because the retriever and generator must be coordinated.
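
A minimal sketch of how dynamic retrieval triggering and iterative retrieval might combine in one control loop. Everything here is an assumption for illustration: `generate` and `retrieve` are hypothetical callables supplied by the caller, and the scalar confidence returned by `generate` stands in for whatever uncertainty signal a real model exposes.

```python
from typing import Callable, List, Tuple


def answer_with_dynamic_retrieval(
    question: str,
    generate: Callable[[str, List[str]], Tuple[str, float]],
    retrieve: Callable[[str], List[str]],
    confidence_threshold: float = 0.7,
    max_rounds: int = 3,
) -> Tuple[str, List[str], int]:
    """Retrieve only while the generator is uncertain, up to max_rounds."""
    context: List[str] = []
    # First attempt uses no retrieved context at all.
    answer, confidence = generate(question, context)
    rounds = 0

    while confidence < confidence_threshold and rounds < max_rounds:
        # Use the draft answer to refine the next query (iterative retrieval).
        query = f"{question} {answer}".strip()
        new_passages = [p for p in retrieve(query) if p not in context]
        if not new_passages:
            break  # Nothing new to add, so further rounds cannot help.
        context.extend(new_passages)
        answer, confidence = generate(question, context)
        rounds += 1

    return answer, context, rounds


# Stub components, used only to exercise the control flow.
def fake_generate(question, context):
    return ("1889", 0.9) if context else ("unknown", 0.2)


def fake_retrieve(query):
    return ["The Eiffel Tower was completed in 1889."]


print(answer_with_dynamic_retrieval(
    "When was the Eiffel Tower completed?", fake_generate, fake_retrieve))
```

The loop also hints at the coordination cost: retriever and generator behavior are entangled, so a failure can originate in either component or in the trigger logic.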

4. Robustness-Oriented Systems

Innovation targets failure modes under adversarial or degraded conditions.

Sub-patterns:

Noise-adaptive training: Expose models to perturbed, irrelevant, or misleading contexts
Hallucination-aware constraints: Enforce grounding during decoding
Adversarial defenses: Protect against corpus poisoning and semantic backdoors

Tradeoff: Essential for production but adds training and inference overhead.
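
Noise-adaptive training starts from training examples whose context mixes relevant and irrelevant passages. The sketch below shows one way to build such an example; the function name `build_noisy_example`, the output field names, and the random sampling strategy are assumptions made for illustration, not a prescribed recipe.

```python
import random
from typing import Dict, List


def build_noisy_example(
    question: str,
    answer: str,
    gold_passages: List[str],
    distractor_pool: List[str],
    num_distractors: int = 2,
    seed: int = 0,
) -> Dict[str, object]:
    """Mix gold passages with sampled distractors and shuffle their order."""
    rng = random.Random(seed)  # Seeded so examples are reproducible.

    # Never sample a gold passage as a distractor.
    candidates = [p for p in distractor_pool if p not in gold_passages]
    distractors = rng.sample(candidates, min(num_distractors, len(candidates)))

    context = list(gold_passages) + distractors
    rng.shuffle(context)  # Avoid teaching the model that gold always comes first.

    return {
        "question": question,
        "context": context,
        "answer": answer,
        # Positions of the gold passages, useful for grounding supervision.
        "gold_indices": [i for i, p in enumerate(context) if p in gold_passages],
    }


example = build_noisy_example(
    question="When was the Eiffel Tower completed?",
    answer="1889",
    gold_passages=["The Eiffel Tower was completed in 1889."],
    distractor_pool=["Berlin is the capital of Germany.",
                     "The Louvre is a museum in Paris.",
                     "Mount Everest is in the Himalayas."],
)
print(example["context"], example["gold_indices"])
```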

When to Use

Diagnosing existing systems: Map the current architecture onto the taxonomy to see where improvements would have the most impact
Designing new systems: Choose approach based on primary constraints (latency vs. accuracy vs. robustness)
Understanding literature: Quickly categorize new papers and their contributions

Limitations

Many real systems are hybrids that don’t fit cleanly into one category
The taxonomy emphasizes architecture over data quality, which is often more important in practice
Doesn’t capture deployment concerns like cost, scaling, or maintenance burden

