Distributional Estimates Over Single Classifications

Context

You need to characterize a category (occupation, segment, cohort) based on measurements from many individual instances that vary.

Problem

Assigning a single value to the category erases meaningful variation. “Office managers require medium strength” hides the reality that 45% need light strength, 40% need medium, and 15% need sedentary. Users making decisions about specific instances inherit the aggregation error.

Solution

Report the distribution of values across instances rather than collapsing to a single classification.

The ORS produces “percentage-of-workers” estimates: not “this occupation requires X strength” but “X% of workers in this occupation require sedentary, Y% light, Z% medium.” The underlying variation becomes visible in the data.

Implementation

  • Collect measurements at the instance level (individual jobs, not occupations)
  • Aggregate by calculating proportions across classification levels
  • Publish percentiles and distributions alongside any summary statistics
  • Make the aggregation logic transparent so users understand what the numbers represent

Consequences

Benefits: Users can assess fit at different thresholds. Policy-makers see the range of reality rather than a single point estimate. The data acknowledges heterogeneity rather than hiding it.

Costs: More complex to communicate. Requires users to interpret distributions rather than simple classifications. May create decision paralysis when a single answer would have enabled action.

Tradeoffs: Distributional reporting preserves information but shifts interpretive burden to users. Some contexts need a single classification for operational reasons, in those cases, the distribution can inform where to set the threshold.

Related:, 07-molecule—operationalizing-work-requirements