LLM Hallucination Taxonomy
Overview
A two-branch classification system for hallucinations in large language models, designed for the general-purpose LLM era rather than task-specific NLG systems.
Hallucination
├── Factuality Hallucination (vs. real-world facts)
│   ├── Factual Contradiction
│   │   ├── Entity error
│   │   └── Relation error
│   └── Factual Fabrication
│       ├── Unverifiability
│       └── Overclaim
│
└── Faithfulness Hallucination (vs. user input / internal consistency)
    ├── Instruction inconsistency
    ├── Context inconsistency
    └── Logical inconsistency
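If the taxonomy needs to be wired into tooling (detectors, eval dashboards, logging), a minimal Python encoding could look like the sketch below; the enum and mapping names are illustrative, not part of the taxonomy itself.

```python
from enum import Enum

class Branch(Enum):
    """Top-level branches: what the output conflicts with."""
    FACTUALITY = "factuality"      # conflicts with real-world facts
    FAITHFULNESS = "faithfulness"  # conflicts with user input / internal consistency

class Subtype(Enum):
    """Leaf categories under each branch."""
    FACTUAL_CONTRADICTION = "factual_contradiction"  # entity or relation errors
    FACTUAL_FABRICATION = "factual_fabrication"      # unverifiable or overclaimed content
    INSTRUCTION_INCONSISTENCY = "instruction_inconsistency"
    CONTEXT_INCONSISTENCY = "context_inconsistency"
    LOGICAL_INCONSISTENCY = "logical_inconsistency"

# Which branch each subtype falls under; handy for routing detection
# strategies and for reporting the two dimensions separately.
BRANCH_OF = {
    Subtype.FACTUAL_CONTRADICTION: Branch.FACTUALITY,
    Subtype.FACTUAL_FABRICATION: Branch.FACTUALITY,
    Subtype.INSTRUCTION_INCONSISTENCY: Branch.FAITHFULNESS,
    Subtype.CONTEXT_INCONSISTENCY: Branch.FAITHFULNESS,
    Subtype.LOGICAL_INCONSISTENCY: Branch.FAITHFULNESS,
}
```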
Why This Taxonomy
Prior taxonomies (intrinsic vs. extrinsic hallucination) assumed task-specific models with clear source contexts. An intrinsic hallucination contradicts the source document; an extrinsic hallucination introduces unverifiable information not in the source.
This breaks down for LLMs because:
- Many interactions have no source document; the “source” is world knowledge
- Users care separately about whether the model followed their instructions and about whether it stated true facts
- RAG systems need both dimensions: faithfulness to retrieved context AND factual accuracy
How to Apply
For detection: Different hallucination types require different verification approaches.
- Factual contradictions → Check against knowledge bases or search
- Factual fabrications → May be undetectable without domain expertise
- Instruction inconsistency → Compare output against parsed instruction intent
- Context inconsistency → NLI/entailment checking against provided context (see the entailment sketch after this list)
- Logical inconsistency → Verify reasoning chain coherence
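As a concrete illustration of the context-inconsistency check, here is a hedged sketch of NLI-based entailment scoring with Hugging Face Transformers. The model choice (roberta-large-mnli), the 0.5 threshold, and the helper names are assumptions; any MNLI-style model would do.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Assumption: any MNLI-style NLI model works here.
MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def entailment_score(context: str, claim: str) -> float:
    """Probability that `claim` is entailed by `context` under the NLI model."""
    inputs = tokenizer(context, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    # Read the "entailment" class index from the config instead of
    # hard-coding it, since label order differs across NLI models.
    entail_idx = {v.lower(): k for k, v in model.config.id2label.items()}["entailment"]
    return probs[entail_idx].item()

def is_context_consistent(context: str, claim: str, threshold: float = 0.5) -> bool:
    """Flag a claim as context-inconsistent when entailment falls below the threshold."""
    return entailment_score(context, claim) >= threshold
```

In a RAG pipeline, `context` would be the retrieved passage and `claim` a sentence extracted from the answer; scoring sentence by sentence is usually more informative than scoring the whole response at once.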
For mitigation: Root causes map onto hallucination types; a toy lookup table follows the list.
- Data issues → Primarily factuality hallucinations (imitative falsehood, knowledge gaps)
- Training issues → Both types (sycophancy affects both, SFT mismatch affects factuality)
- Inference issues → Primarily faithfulness (attention drift, decoding randomness)
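A toy encoding of the cause-to-type mapping, mirroring the bullets above; the cause labels are shorthand and assume the causes described in the related lifecycle note.

```python
# Illustrative mapping from root-cause buckets to the hallucination
# branch(es) they most often produce.
CAUSE_TO_BRANCHES = {
    "imitative_falsehood": {"factuality"},
    "knowledge_gap":       {"factuality"},
    "sycophancy":          {"factuality", "faithfulness"},
    "sft_mismatch":        {"factuality"},
    "attention_drift":     {"faithfulness"},
    "decoding_randomness": {"faithfulness"},
}
```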
For user communication: The taxonomy helps explain what went wrong.
- “The model didn’t follow your instructions” ≠ “The model stated false facts”
- Users can calibrate trust differently for each dimension
When This Especially Matters
- RAG system design: Context inconsistency is the failure mode to watch
- High-stakes applications: Factual contradiction vs. fabrication has different risk profiles
- Evaluation: Metrics should capture both dimensions separately (a record sketch follows this list)
- Root cause analysis: Taxonomy helps trace hallucinations to likely causes
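For the evaluation point, a hedged sketch of a per-response record that keeps the two dimensions separate; the field names and the 0-to-1 scale are assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class HallucinationEval:
    """Hypothetical per-response record; factuality and faithfulness are never merged."""
    response_id: str
    factuality_score: float    # agreement with real-world facts, in [0, 1]
    faithfulness_score: float  # agreement with instructions/context, in [0, 1]
    subtypes: set[str] = field(default_factory=set)  # e.g. {"context_inconsistency"}

def summarize(evals: list[HallucinationEval]) -> dict[str, float]:
    """Report the two dimensions side by side rather than collapsing them into one score."""
    n = max(len(evals), 1)
    return {
        "mean_factuality": sum(e.factuality_score for e in evals) / n,
        "mean_faithfulness": sum(e.faithfulness_score for e in evals) / n,
    }
```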
Limitations
The categories aren’t always clean. A response might be:
- Unfaithful to context AND factually wrong
- Faithful to incorrect retrieved content
- Logically consistent in its wrongness
The taxonomy is a diagnostic lens, not a complete partition.
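In practice this means any programmatic labeling should allow a set of tags per response rather than a single class. A hypothetical example:

```python
# One RAG answer, two overlapping labels: it contradicts the retrieved
# passage AND states something false about the world.
labels = {"context_inconsistency", "factual_contradiction"}

# Faithful-but-wrong case: the answer mirrors the retrieved passage
# exactly, but the passage itself is incorrect, so only the factuality
# branch is flagged.
labels_faithful_but_wrong = {"factual_contradiction"}
```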
Related: 05-molecule—hallucination-causes-lifecycle, 05-molecule—capability-alignment-gap, 07-molecule—ui-as-ultimate-guardrail