Research Gaps in Data Quality Definition

Key open questions identified in a systematic review of the data quality literature:

Consensus gaps:

  • Can quasi-standards emerge for general-purpose data quality, or is context-specificity inherent?
  • Should the field standardize the set of quality dimensions, even if definitions vary?
  • How should data quality definitions adapt to new types of data (streaming, IoT, AI training data)?

Definition completeness gaps:

  • Most definitions list attributes without providing quality requirements: what would requirement-based definitions look like in practice?
  • Few definitions include guiding principles for quality enhancement: how should frameworks connect assessment to improvement?
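
To make the first question above concrete, one possible reading is that a requirement-based definition pairs each dimension with a measure and a threshold, instead of only naming the dimension. The sketch below is illustrative and not drawn from the reviewed literature: `QualityRequirement`, `completeness_of`, the `email` field, and the 0.95 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class QualityRequirement:
    """Hypothetical requirement: a dimension plus a measurable target."""
    dimension: str
    measure: Callable[[Sequence[dict]], float]  # returns a score in [0, 1]
    threshold: float                            # minimum acceptable score

    def is_met(self, records: Sequence[dict]) -> bool:
        return self.measure(records) >= self.threshold


def completeness_of(field: str) -> Callable[[Sequence[dict]], float]:
    """Build a measure: the share of records with a non-empty value for `field`."""
    def measure(records: Sequence[dict]) -> float:
        if not records:
            return 0.0
        filled = sum(1 for r in records if r.get(field) not in (None, ""))
        return filled / len(records)
    return measure


# Attribute-style definition: names the dimension, states no target.
attribute_only = ["completeness"]

# Requirement-style definition: same dimension, with a measure and a threshold
# (the 0.95 value is an assumption chosen for illustration).
requirement = QualityRequirement("completeness", completeness_of("email"), 0.95)

records = [
    {"email": "a@example.org"},
    {"email": None},
    {"email": "c@example.org"},
    {"email": "d@example.org"},
]
score = requirement.measure(records)
met = requirement.is_met(records)
```

Under these assumptions the requirement can be checked mechanically against a dataset, whereas the attribute list alone cannot; whether thresholds like this can be set independently of context is exactly the open question.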

Contextual relationship gaps:

  • Societal context (bias, provenance, diversity) is underrepresented in most frameworks: how should it be integrated?
  • System-level quality (availability, security) is often treated separately from data quality: should the two be unified?

Methodological gaps:

  • Few definitions explain their provenance (intuitive, theoretical, empirical): how does methodology affect validity?
  • How can competing definitions be reconciled without losing the insights unique to each?

Related: 04-atom—data-quality-consensus-gap, 04-atom—vocabulary-problem-dq, 04-molecule—dq-contextual-relationships