Research Gaps in Data Quality Definition
Key open questions identified in a systematic review of the data quality literature:
Consensus gaps:
- Can quasi-standards emerge for general-purpose data quality, or is context-specificity inherent?
- Should the field standardize the set of quality dimensions, even if definitions vary?
- How should data quality definitions adapt to new types of data (streaming, IoT, AI training data)?
Definition completeness gaps:
- Most definitions list attributes without providing quality requirements: what would requirement-based definitions look like in practice?
- Few definitions include guiding principles for quality enhancement: how should frameworks connect assessment to improvement?
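As one possible reading of the "requirement-based" question above, the sketch below contrasts a bare attribute name with a requirement that pairs the attribute with a measure and an acceptance threshold. Everything in it is an illustrative assumption, not taken from the reviewed literature: the `QualityRequirement` class, the `completeness` measure, the 0.9 threshold, and the sample readings are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

Values = Sequence[Optional[float]]


def completeness(values: Values) -> float:
    """Fraction of values that are present (not None); 0.0 for empty input."""
    if not values:
        return 0.0
    return sum(v is not None for v in values) / len(values)


@dataclass(frozen=True)
class QualityRequirement:
    """Hypothetical requirement: an attribute plus a measure and a threshold."""
    dimension: str                      # the attribute an attribute-only definition would stop at
    description: str                    # human-readable statement of the requirement
    measure: Callable[[Values], float]  # how the attribute is quantified
    threshold: float                    # minimum acceptable score (assumed value)

    def check(self, values: Values) -> bool:
        """Return True when the measured score meets the threshold."""
        return self.measure(values) >= self.threshold


# Assumed example requirement; the 0.9 threshold is arbitrary.
requirement = QualityRequirement(
    dimension="completeness",
    description="At least 90% of sensor readings are present",
    measure=completeness,
    threshold=0.9,
)

readings = [21.5, None, 22.0, 21.8, 22.1]  # made-up sample data
score = completeness(readings)             # 4 of 5 present
passed = requirement.check(readings)       # below the 0.9 threshold
```

An attribute-only definition would stop at the string "completeness"; the requirement form makes the pass/fail criterion explicit, which is also where an improvement step could attach.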
Contextual relationship gaps:
- Societal context (bias, provenance, diversity) is underrepresented in most frameworks: how should it be integrated?
- System-level quality (availability, security) is often treated separately from data quality: should the two be unified?
Methodological gaps:
- Few definitions explain their provenance (intuitive, theoretical, empirical): how does methodology affect validity?
- How can competing definitions be reconciled without losing the insights unique to each?
Related: 04-atom—data-quality-consensus-gap, 04-atom—vocabulary-problem-dq, 04-molecule—dq-contextual-relationships