The Retrieval-Generation Cascade Problem

In RAG systems, retrieval quality and generation quality are interdependent but not correlated in straightforward ways.

Good retrieval doesn’t guarantee good generation. A retriever might return highly relevant documents that contain conflicting information, overwhelming the generator’s reasoning capacity.

Poor retrieval can be partially compensated for by strong generation. A capable LLM might extract value from marginally relevant documents or fall back on parametric knowledge when retrieval fails.

The cascade creates evaluation challenges: end-to-end metrics conflate component failures. When the system gives a wrong answer, was it a retrieval miss, a generation hallucination, or a failure to integrate multiple sources?

This argues for multi-stage evaluation that isolates component contributions, but isolation itself is artificial, since real performance emerges from interaction.
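One way to make the attribution concrete is to label each end-to-end failure by whether retrieval succeeded first. The sketch below assumes per-example gold document labels and an answer-correctness judgment, both hypothetical inputs not specified in this note; it also inherits the caveat above, since a generator falling back on parametric knowledge can mask a retrieval miss.

```python
from dataclasses import dataclass

@dataclass
class Example:
    gold_doc_ids: set      # ids of documents that support the answer (assumed available)
    retrieved_ids: list    # ranked ids returned by the retriever
    answer_correct: bool   # end-to-end judgment of the generated answer

def diagnose(examples, k=5):
    """Attribute end-to-end failures to retrieval vs. generation.

    A failure with no gold document in the top-k counts as a retrieval
    miss; a failure despite successful retrieval counts against the
    generator (hallucination or integration failure). This is only an
    approximation: interaction effects are invisible to it.
    """
    counts = {"correct": 0, "retrieval_miss": 0, "generation_failure": 0}
    for ex in examples:
        hit = bool(ex.gold_doc_ids & set(ex.retrieved_ids[:k]))
        if ex.answer_correct:
            counts["correct"] += 1
        elif not hit:
            counts["retrieval_miss"] += 1
        else:
            counts["generation_failure"] += 1
    return counts
```

Comparing the retrieval-miss and generation-failure buckets across system variants indicates which component to invest in, even though the split is artificial in the sense described above.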

Related: 05-atom—internal-external-evaluation-distinction