LLM Hallucination Taxonomy

Overview

A two-branch classification system for hallucinations in large language models, designed for the general-purpose LLM era rather than task-specific NLG systems.

Hallucination
├── Factuality Hallucination (vs. real-world facts)
│   ├── Factual Contradiction
│   │   ├── Entity error
│   │   └── Relation error
│   └── Factual Fabrication
│       ├── Unverifiability
│       └── Overclaim
│
└── Faithfulness Hallucination (vs. user input / internal consistency)
    ├── Instruction inconsistency
    ├── Context inconsistency
    └── Logical inconsistency

Why This Taxonomy

Prior taxonomies (intrinsic vs. extrinsic hallucination) assumed task-specific models with clear source contexts. An intrinsic hallucination contradicts the source document; an extrinsic hallucination introduces unverifiable information not in the source.

This breaks down for LLMs because:

  1. Many interactions have no source document; the “source” is the model's world knowledge
  2. Users care separately about whether the model followed their instructions and whether it stated true facts
  3. RAG systems need both dimensions: faithfulness to retrieved context AND factual accuracy

How to Apply

For detection: Different hallucination types require different verification approaches (a dispatch sketch follows the list).

  • Factual contradictions → Check against knowledge bases or search
  • Factual fabrications → May be undetectable without domain expertise
  • Instruction inconsistency → Compare output against parsed instruction intent
  • Context inconsistency → NLI/entailment checking against provided context
  • Logical inconsistency → Verify reasoning chain coherence
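
A minimal dispatch sketch, assuming hypothetical checker functions (the names and signatures below are placeholders, not real library calls); the point is only that each label routes to a different verification strategy.

  # Hypothetical routing of hallucination types to verification strategies.
  # Each checker is a stub: real versions would query a knowledge base or
  # search API, run an NLI model, parse the instruction, or verify the
  # reasoning chain step by step.

  def check_knowledge_base(output: str, reference: str) -> bool: ...
  def flag_for_expert_review(output: str, reference: str) -> bool: ...
  def compare_to_instruction(output: str, reference: str) -> bool: ...
  def nli_entailment_check(output: str, reference: str) -> bool: ...
  def verify_reasoning_chain(output: str, reference: str) -> bool: ...

  DETECTORS = {
      "factual_contradiction":     check_knowledge_base,    # vs. external facts
      "factual_fabrication":       flag_for_expert_review,  # may need a human
      "instruction_inconsistency": compare_to_instruction,  # vs. parsed intent
      "context_inconsistency":     nli_entailment_check,    # vs. provided context
      "logical_inconsistency":     verify_reasoning_chain,  # internal coherence
  }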

For mitigation: Root causes map onto hallucination types (sketched as a mapping after the list).

  • Data issues → Primarily factuality hallucinations (imitative falsehood, knowledge gaps)
  • Training issues → Both types (sycophancy affects both, SFT mismatch affects factuality)
  • Inference issues → Primarily faithfulness (attention drift, decoding randomness)
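
The same mapping expressed as data, as a rough sketch for root-cause analysis; the cause and type names are this note's shorthand, not a standard schema.

  # Rough cause-to-type mapping; "both" means the cause can surface as
  # either factuality or faithfulness hallucinations.
  CAUSE_TO_TYPE = {
      "data": {
          "imitative_falsehood": "factuality",
          "knowledge_gap": "factuality",
      },
      "training": {
          "sycophancy": "both",
          "sft_capability_mismatch": "factuality",
      },
      "inference": {
          "attention_drift": "faithfulness",
          "decoding_randomness": "faithfulness",
      },
  }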

For user communication: The taxonomy helps explain what went wrong.

  • “The model didn’t follow your instructions” ≠ “The model stated false facts”
  • Users can calibrate trust differently for each dimension

When This Especially Matters

  • RAG system design: Context inconsistency is the failure mode to watch
  • High-stakes applications: Factual contradictions and fabrications carry different risk profiles
  • Evaluation: Metrics should capture both dimensions separately (see the record sketch after this list)
  • Root cause analysis: Taxonomy helps trace hallucinations to likely causes
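
One way to keep evaluation honest about both dimensions is to score them as separate fields rather than one blended number; a sketch with illustrative field names:

  # Hypothetical per-response evaluation record: factuality and faithfulness
  # are scored independently instead of being collapsed into one number.
  from dataclasses import dataclass, field

  @dataclass
  class HallucinationEval:
      response_id: str
      factuality_score: float      # agreement with real-world facts, 0.0-1.0
      faithfulness_score: float    # agreement with instructions/context, 0.0-1.0
      labels: list = field(default_factory=list)  # e.g., ["context_inconsistency"]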

Limitations

The categories aren’t always clean. A response might be:

  • Unfaithful to context AND factually wrong
  • Faithful to incorrect retrieved content
  • Logically consistent in its wrongness

The taxonomy is a diagnostic lens, not a complete partition.

Related: 05-molecule—hallucination-causes-lifecycle, 05-molecule—capability-alignment-gap, 07-molecule—ui-as-ultimate-guardrail