Annotation Task Suitability Framework

Definition

A framework for evaluating whether a given annotation task is appropriate for crowdsourcing, expert annotation, or AI-assisted labeling, based on task characteristics and quality requirements.

Key Dimensions

  • Task Complexity: Simple (binary) vs. complex (multi-step reasoning)
  • Required Expertise: General knowledge vs. domain specialization
  • Subjectivity: Objective ground truth vs. inherently subjective
  • Disagreement Signal: Is annotator disagreement noise or information?
  • Scale Requirements: Thousands of examples vs. hundreds
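
One concrete way to make these dimensions operational is to record them as a small task profile. The sketch below is illustrative only; the names (`TaskProfile`, `Complexity`, the field names) are hypothetical and not part of the framework itself.

```python
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    SIMPLE = "simple"    # e.g. binary yes/no judgments
    COMPLEX = "complex"  # multi-step reasoning required


@dataclass
class TaskProfile:
    """One annotation task, described along the key dimensions above."""
    complexity: Complexity
    needs_domain_expertise: bool   # general knowledge vs. domain specialization
    is_subjective: bool            # objective ground truth vs. inherently subjective
    disagreement_is_signal: bool   # is annotator disagreement noise or information?
    target_examples: int           # scale requirement (hundreds vs. thousands)
```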

Suitability Matrix

Task Type         | Crowd         | Expert        | AI-Assisted
------------------|---------------|---------------|------------------------
Simple objective  | ✓             |               | ✓
Complex objective |               | ✓             |
Subjective        | ✓ (aggregate) | ✓ (calibrate) |
Specialized       |               | ✓             | ✓ (with expert review)
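
A minimal sketch of how the matrix could be applied, reusing the hypothetical `TaskProfile` above; the rules simply restate the table rows and are not a normative routing algorithm.

```python
def suitable_methods(profile: TaskProfile) -> list[str]:
    """Return candidate labeling approaches for a task, per the matrix above."""
    if profile.needs_domain_expertise:            # "Specialized" row
        return ["expert", "AI-assisted (with expert review)"]
    if profile.is_subjective:                     # "Subjective" row
        return ["crowd (aggregate labels)", "expert (calibrated)"]
    if profile.complexity is Complexity.COMPLEX:  # "Complex objective" row
        return ["expert"]
    return ["crowd", "AI-assisted"]               # "Simple objective" row
```

Scale and the disagreement-signal dimension would further shape the choice (for example, large scale pushes toward crowd or AI-assisted labeling, and tasks where disagreement is informative call for collecting multiple labels per item rather than a single adjudicated one), but those refinements are left out of the sketch.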

Quality Considerations

  • Inter-annotator agreement thresholds (see the sketch after this list)
  • Gold standard validation sets
  • Annotator calibration procedures
  • Error analysis and adjudication
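
As an illustration of the first two bullets, the sketch below computes pairwise Cohen's kappa between two annotators and per-annotator accuracy on seeded gold items. The thresholds at the bottom are placeholders, not values prescribed by this note.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Pairwise Cohen's kappa for two annotators labeling the same items."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    categories = set(labels_a) | set(labels_b)
    expected = sum((freq_a[c] / n) * (freq_b[c] / n) for c in categories)
    if expected == 1.0:  # degenerate case: chance agreement is already perfect
        return 1.0
    return (observed - expected) / (1 - expected)


def gold_accuracy(annotations: dict[str, str], gold: dict[str, str]) -> float:
    """Share of gold-standard items an annotator labeled correctly."""
    overlap = [item for item in gold if item in annotations]
    if not overlap:
        return 0.0
    return sum(annotations[i] == gold[i] for i in overlap) / len(overlap)


# Placeholder thresholds -- real cutoffs depend on the task and label scheme.
KAPPA_THRESHOLD = 0.6  # often cited as the floor for "substantial" agreement
GOLD_THRESHOLD = 0.9   # minimum accuracy on seeded gold items
```

Krippendorff's alpha generalizes chance-corrected agreement to more than two annotators and to missing labels, and is the more common choice for larger annotator pools.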

Common Mistakes

  • Using crowds for expert tasks (quality problems)
  • Using experts for simple tasks (cost problems)
  • Assuming disagreement is always error
  • Insufficient annotator training

Related: 03-research-methods, 05-atom—llm-annotation-reliability-gap