Approaches to Deriving Data Quality Definitions

Overview

Data quality definitions emerge from three distinct methodological traditions:

| Approach    | Method                                    | Strengths                  | Weaknesses                           |
|-------------|-------------------------------------------|----------------------------|--------------------------------------|
| Intuitive   | Expert judgment, practitioner experience  | Practical, fast to develop | Lacks rigor, hard to validate        |
| Theoretical | Ontological analysis, information theory  | Rigorous foundations       | May miss practical concerns          |
| Empirical   | User studies, surveys of data consumers   | Grounded in actual needs   | Context-specific, may not generalize |

When Each Applies

Intuitive approaches work when:

  • A definition is needed quickly for a specific project
  • Deep domain expertise is available
  • The context is well-understood and stable

Theoretical approaches work when:

  • Foundational definitions that later work can build on are needed
  • The goal is cross-domain applicability
  • Formal reasoning about quality is required (see the example after this list)
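
As an illustration of the kind of definition this approach yields, a standard ratio-based completeness measure (used here purely as an example, not drawn from any particular framework) can be written as:

$$
\mathrm{completeness}(D) = \frac{\lvert \{\, c \in \mathrm{cells}(D) : c \neq \mathrm{null} \,\} \rvert}{\lvert \mathrm{cells}(D) \rvert}
$$

A definition in this form supports formal reasoning: it is bounded in [0, 1] and monotone in the number of populated cells.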

Empirical approaches work when:

  • User-facing quality matters most
  • The definition must reflect actual (not assumed) quality needs
  • Validation against real-world usage is critical

Key Differences

Foundation: Intuitive definitions rest on experience; theoretical on formal models; empirical on data from users.

Validation: Intuitive definitions are hard to validate objectively; theoretical definitions can be checked for internal consistency; empirical definitions can be validated against user satisfaction.
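
A minimal sketch of what empirical validation can look like, assuming a completeness-style score and per-dataset user satisfaction ratings; all names and data below are illustrative, not from any published study:

```python
# Sketch: does an empirically derived quality score track user satisfaction?
from statistics import mean

def completeness(record: dict, required: list[str]) -> float:
    """Fraction of required fields that are populated."""
    return sum(record.get(f) is not None for f in required) / len(required)

def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation, written out to keep the sketch dependency-free."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

required = ["name", "email", "country"]
records = [
    {"name": "a", "email": "a@x", "country": "US"},  # fully populated
    {"name": "b", "email": None, "country": None},   # mostly missing
    {"name": "c", "email": "c@x", "country": None},  # partially populated
]
scores = [completeness(r, required) for r in records]
satisfaction = [4.6, 2.1, 3.4]  # hypothetical average user ratings (1-5)
print(f"definition-satisfaction correlation: r = {pearson(scores, satisfaction):.2f}")
```

A strong correlation is evidence that the definition captures what users actually experience as quality; a weak one suggests the dimensions or weights need revisiting.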

Generalizability: Theoretical definitions aim for universality; empirical definitions reflect the population studied; intuitive definitions reflect the expert’s experience.

The Pattern

The most robust quality frameworks combine approaches: theoretical foundations establish the space of possible dimensions, empirical studies identify which dimensions matter to actual users, and practitioner intuition fills gaps and guides prioritization.
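
A sketch of that combination in code, under assumed names: the dimension metrics stand in for a theoretical taxonomy, the weights for the output of a user study, and the threshold for a practitioner's judgment call. None of these values come from the source; they only illustrate the division of labor.

```python
# Sketch: theoretical dimensions, empirical weights, intuitive thresholds.
from typing import Callable

Dataset = list[dict]

def completeness(data: Dataset) -> float:
    """Fraction of all cells that are populated."""
    cells = [v for row in data for v in row.values()]
    return sum(v is not None for v in cells) / len(cells)

def consistency(data: Dataset) -> float:
    """Toy rule: every row must carry the same set of keys."""
    return 1.0 if len({frozenset(row) for row in data}) == 1 else 0.0

# Theoretical step: the space of candidate dimensions.
DIMENSIONS: dict[str, Callable[[Dataset], float]] = {
    "completeness": completeness,
    "consistency": consistency,
}

# Empirical step: weights a user study might assign (illustrative numbers).
WEIGHTS = {"completeness": 0.7, "consistency": 0.3}

# Intuitive step: a practitioner-chosen acceptance threshold.
THRESHOLD = 0.8

def overall_quality(data: Dataset) -> float:
    return sum(WEIGHTS[name] * fn(data) for name, fn in DIMENSIONS.items())

data = [{"id": 1, "email": "a@x"}, {"id": 2, "email": None}]
score = overall_quality(data)
print(f"quality = {score:.2f}, acceptable = {score >= THRESHOLD}")
```

The point of the structure is that each layer can be revised independently: new theory adds dimensions, new studies re-weight them, and practitioners retune thresholds without touching the rest.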

Related: 04-atom—data-quality-consensus-gap, 03-molecule—foda-taxonomy-methodology, 04-atom—fitness-for-use-definition