Retrieval Substitutes for Scale
A moderately sized model with good retrieval can match the performance of a much larger model that lacks retrieval.
DeepMind’s RETRO (2022) demonstrated this directly: a 7.5B-parameter model with access to a 2-trillion-token retrieval corpus matched the 175B-parameter GPT-3 on many language-modeling benchmarks, despite having roughly 25× fewer parameters. The retrieval mechanism effectively supplied the “knowledge” that would otherwise have to be baked into additional model parameters.
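A minimal sketch of the retrieval half of that idea, under loose assumptions: a toy hash-based embedding stands in for the frozen BERT encoder RETRO uses for chunk keys, and a brute-force dot product stands in for its approximate nearest-neighbor (SCaNN) index over 2 trillion tokens. The function names (`embed`, `build_index`, `retrieve_neighbors`) are illustrative, not from the paper, and the step where RETRO feeds retrieved neighbors into the transformer via chunked cross-attention is omitted entirely.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Toy embedding: hash character trigrams into a fixed-size unit vector.
    # Placeholder for the frozen BERT encoder RETRO uses to key its chunks.
    vec = np.zeros(dim)
    for i in range(len(text) - 2):
        vec[hash(text[i:i + 3]) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

def build_index(corpus_chunks: list[str]) -> np.ndarray:
    # One embedding per corpus chunk; this matrix is where the "knowledge" lives.
    return np.stack([embed(c) for c in corpus_chunks])

def retrieve_neighbors(input_chunks: list[str], corpus_chunks: list[str],
                       index: np.ndarray, k: int = 2) -> list[list[str]]:
    # For each input chunk, return the k most similar corpus chunks
    # (cosine similarity, since all embeddings are unit-normalized).
    out = []
    for c in input_chunks:
        scores = index @ embed(c)
        top = np.argsort(scores)[::-1][:k]
        out.append([corpus_chunks[i] for i in top])
    return out

# Usage: the generator would condition on these neighbors. RETRO does this
# per 64-token input chunk, inside the transformer, via chunked cross-attention.
corpus = ["retrieval augments a small model", "parameters store knowledge",
          "a corpus can be updated without retraining", "scaling laws for compute"]
index = build_index(corpus)
print(retrieve_neighbors(["can a small model use retrieval"], corpus, index, k=2))
```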
The practical implication is significant: rather than scaling model size indefinitely (with the attendant compute and memory costs), a system can invest in better retrieval infrastructure and a larger knowledge corpus. The knowledge lives in the corpus, not in the weights.
The pattern suggests a design principle: when knowledge is the bottleneck, add retrieval before adding parameters.
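The same principle at the system level, building on the retrieval helpers sketched above: the small model's prompt is assembled from retrieved passages, so updating or expanding the corpus changes what the system "knows" without touching the weights. The `generate` argument is a placeholder for whatever small model's completion call is available, not a specific API.

```python
def answer_with_retrieval(question: str, corpus_chunks: list[str],
                          index: np.ndarray, generate, k: int = 3) -> str:
    # Assemble the prompt from retrieved passages: the knowledge arrives from
    # the corpus at query time instead of from extra model parameters.
    context = retrieve_neighbors([question], corpus_chunks, index, k=k)[0]
    prompt = "Context:\n" + "\n".join(context) + f"\n\nQuestion: {question}\nAnswer:"
    return generate(prompt)  # `generate` = any small model's completion function
```

Improving such a system then means improving the embedder, the index, and the corpus, rather than adding parameters.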
Related: 05-atom—rag-definition