Learning to Defer (L2D)
A paradigm where AI systems are trained to recognize their own limitations and explicitly defer decisions to human judgment when necessary.
Rather than always producing an output, L2D systems ask: “Am I confident enough to decide this, or should a human handle it?” The AI learns not just to predict outcomes but to predict when its predictions are likely to be unreliable.
This creates dynamic task allocation. The AI handles cases it’s confident about; humans handle edge cases, novel situations, and high-stakes decisions where human judgment adds value.
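One standard way to train this behavior is the augmented-output formulation of Mozannar and Sontag (2020): the model produces K class scores plus one extra "defer" score, and the defer option is rewarded during training only on cases the human expert would get right. Below is a minimal NumPy sketch of that surrogate loss and the resulting decision rule; the function names are illustrative, not from any library.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def l2d_loss(logits, y, expert_pred):
    """Surrogate loss over K + 1 outputs; the last slot means "defer".

    Cross-entropy on the true class y, plus a second cross-entropy
    term on the defer slot that applies only when the expert's answer
    matches y. Deferral is thus learned exactly on the cases where
    the human would do well.
    """
    p = softmax(logits)
    loss = -np.log(p[y])
    if expert_pred == y:
        loss -= np.log(p[-1])
    return loss

def l2d_decide(logits):
    """At inference: predict a class, or defer if the defer slot wins."""
    k = int(np.argmax(logits))
    return "defer" if k == len(logits) - 1 else k
```

The logits can come from any trained network; the loss is what makes the extra output mean "the human will handle this better" rather than merely "low confidence."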
Recent extensions include:
- Multi-expert L2D: Systems that can defer to different human experts based on case characteristics
- Cost-sensitive L2D: Factoring in the cost of deferral (human time, workload constraints) when deciding whether to defer; a sketch combining this with multi-expert routing follows the list
- Complementary L2D: Training models specifically to complement the capabilities of available human experts
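To make the first two extensions concrete, here is a hedged sketch of cost-sensitive routing across several experts: pick whichever option, model or expert, minimizes expected cost, trading estimated error probability against the cost of consuming each expert's time. The cost model, names, and numbers are illustrative assumptions, not a specific published algorithm.

```python
import numpy as np

def route(model_conf, expert_conf, expert_cost, error_cost=1.0):
    """Choose the option with the lowest expected cost.

    model_conf:  model's estimated probability of being correct here.
    expert_conf: per-expert estimated accuracy on this case (array).
    expert_cost: per-expert cost of a deferral, e.g. time (array).
    error_cost:  cost assigned to a wrong final decision.
    Returns -1 if the model should decide, else the expert's index.
    """
    expected = np.concatenate((
        [(1.0 - model_conf) * error_cost],               # model decides
        (1.0 - expert_conf) * error_cost + expert_cost,  # defer to expert j
    ))
    return int(expected.argmin()) - 1  # shift: slot 0 is the model

# Illustrative numbers: a confident model, a strong-but-busy
# specialist (index 0), and a cheaper generalist (index 1).
choice = route(model_conf=0.92,
               expert_conf=np.array([0.97, 0.85]),
               expert_cost=np.array([0.10, 0.02]))
# choice == -1 here: the model's expected cost (0.08) beats both
# experts (0.03 + 0.10 = 0.13 and 0.15 + 0.02 = 0.17).
```

In practice the per-expert accuracy estimates would themselves be learned per case, which is what makes the allocation dynamic rather than a fixed threshold.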
L2D represents a mature approach to human-AI collaboration: the AI doesn’t pretend to handle everything, and humans aren’t burdened with reviewing everything.
Related: 05-atom—haic-three-modes, 05-atom—trust-calibration-problem