Underutilized Data Dependencies

Input signals that provide little incremental modeling benefit but remain in the system, making it unnecessarily vulnerable to change.

These dependencies creep in through several paths:

Legacy features: Included early in development, made redundant by later additions, but never removed.

Bundled features: Evaluated as a group, found beneficial, and added together under deadline pressure, including features that individually add little value.

ε-features: Features kept for a very small accuracy gain that seemed worth the added complexity at the time.

Correlated features: Two features are strongly correlated, but only one is directly causal. The model often can't distinguish them, so it credits both. If the correlation later shifts, predictions that leaned on the non-causal feature degrade.
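
The correlated-features case can be illustrated with a toy sketch. It assumes a two-feature linear model fit by plain gradient descent from zero weights, where `x2` is an exact copy of the causal feature `x1`. The function name, data, and learning rate are all hypothetical, and real models and data will behave less cleanly.

```python
def fit_two_feature_linear(rows, targets, lr=0.01, steps=500):
    """Fit y = w1*x1 + w2*x2 by gradient descent on mean squared error."""
    w1 = w2 = 0.0  # assumption: zero initialization
    n = len(rows)
    for _ in range(steps):
        g1 = g2 = 0.0
        for (x1, x2), y in zip(rows, targets):
            err = w1 * x1 + w2 * x2 - y
            g1 += 2 * err * x1 / n
            g2 += 2 * err * x2 / n
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2


# The target depends only on x1; x2 merely mirrors it in the training data.
x1_values = [1.0, 2.0, 3.0, 4.0]
train_rows = [(x, x) for x in x1_values]
train_targets = list(x1_values)

w1, w2 = fit_two_feature_linear(train_rows, train_targets)
# Both weights end up near 0.5: the model splits credit between the features.

# Later the correlation breaks: x2 reads 0.0 while x1 is still 4.0.
shifted_prediction = w1 * 4.0 + w2 * 0.0
# The true target is 4.0, but the model now predicts about half of that.
```

In this setup the training error is near zero, so nothing in the offline metrics signals that half of the model's output rides on the non-causal feature.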

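These dependencies can be detected with exhaustive leave-one-feature-out evaluations: retrain without each feature in turn and compare the result against the full model. Run these evaluations regularly to identify and remove unnecessary dependencies before they become liabilities.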
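
A leave-one-feature-out evaluation might look like the following minimal sketch. It assumes a caller-supplied `evaluate` function that trains a model on the given feature names and returns a held-out score where higher is better. The names `leave_one_feature_out`, `removal_candidates`, and the `tolerance` threshold are hypothetical, not taken from any particular library.

```python
from typing import Callable, Dict, List, Sequence


def leave_one_feature_out(
    features: Sequence[str],
    evaluate: Callable[[Sequence[str]], float],
) -> Dict[str, float]:
    """Return, per feature, the score drop caused by removing only that feature."""
    baseline = evaluate(list(features))
    drops: Dict[str, float] = {}
    for feature in features:
        reduced = [f for f in features if f != feature]
        # Retrain and rescore without this one feature.
        drops[feature] = baseline - evaluate(reduced)
    return drops


def removal_candidates(drops: Dict[str, float], tolerance: float = 0.001) -> List[str]:
    """List features whose removal costs at most `tolerance` in score."""
    return sorted(f for f, drop in drops.items() if drop <= tolerance)
```

Each feature costs one full retraining run, so the evaluation scales linearly with the number of features. Note that two redundant features can each look removable on their own, so candidates are safer to drop one at a time, rerunning the evaluation after each removal.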

Related: 04-atom—unstable-data-dependencies, 05-molecule—ml-technical-debt-taxonomy