HAIC Evaluation by Collaboration Mode

Overview

Different modes of Human-AI Collaboration require different evaluation approaches. What you measure should match how the system distributes authority and tasks.

The Three Modes

  • AI-Centric: AI leads; humans monitor or receive outputs.
  • Human-Centric: Humans lead; AI augments their capabilities.
  • Symbiotic: Balanced partnership with mutual adaptation.

Evaluation by Mode

AI-Centric Evaluation

Primary focus: AI system performance

  • Prediction accuracy, precision, recall
  • Processing efficiency, response time
  • Error rates and failure modes
  • Robustness under varied conditions

Human factors matter less here because the human is primarily a recipient of AI output. But trust calibration still matters for the decisions humans make based on AI recommendations.
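
As a rough illustration, here is a minimal sketch of how these AI-centric metrics might be computed from logged predictions. The record schema (label, prediction, latency_s) and the binary-decision framing are assumptions for the example, not something the framework prescribes.

```python
# Sketch: AI-centric metrics for a binary-decision system (assumed record schema).

def ai_centric_metrics(records):
    """records: list of dicts with 'label', 'prediction' (0/1), and 'latency_s'."""
    tp = sum(1 for r in records if r["prediction"] == 1 and r["label"] == 1)
    fp = sum(1 for r in records if r["prediction"] == 1 and r["label"] == 0)
    fn = sum(1 for r in records if r["prediction"] == 0 and r["label"] == 1)
    tn = sum(1 for r in records if r["prediction"] == 0 and r["label"] == 0)
    total = len(records)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "error_rate": (fp + fn) / total if total else 0.0,
        "mean_latency_s": sum(r["latency_s"] for r in records) / total if total else 0.0,
    }
```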

Human-Centric Evaluation

Primary focus: User experience and augmentation value

  • Clarity of communication
  • Ease of use and learning curve
  • Task completion time with vs. without AI
  • User confidence and satisfaction
  • Expertise utilization (is the AI helping humans do what they do best?)

The AI succeeds when humans feel more capable, not when the AI demonstrates capability.
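
A minimal sketch of one way to quantify augmentation value, assuming paired sessions with and without AI assistance. The session fields (assisted, duration_s, confidence) are illustrative assumptions.

```python
# Sketch: human-centric augmentation metrics (assumed session schema).

from statistics import mean

def augmentation_value(sessions):
    """sessions: list of dicts with 'assisted' (bool), 'duration_s', 'confidence' (e.g. 1-5)."""
    assisted = [s for s in sessions if s["assisted"]]
    baseline = [s for s in sessions if not s["assisted"]]
    if not assisted or not baseline:
        return None  # both conditions are needed to estimate augmentation value
    time_saved = mean(s["duration_s"] for s in baseline) - mean(s["duration_s"] for s in assisted)
    confidence_gain = mean(s["confidence"] for s in assisted) - mean(s["confidence"] for s in baseline)
    return {"time_saved_s": time_saved, "confidence_gain": confidence_gain}
```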

Symbiotic Evaluation

Primary focus: Collaboration quality and mutual adaptation

  • Adaptability scores (how well do both parties adjust?)
  • Dynamic task allocation effectiveness
  • Feedback loop quality and impact
  • Trust development over time
  • Joint decision-making outcomes
  • Error reduction through collaboration

This is the hardest mode to evaluate because success emerges from the relationship, not from either party alone.
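
Two of the listed signals can at least be sketched numerically: trust development over time and error reduction through collaboration. The sketch below assumes per-session trust ratings and cases labeled with whether the human, the AI, and the joint decision were each correct; all field names are illustrative assumptions.

```python
# Sketch: two symbiotic signals, trust trajectory and collaborative error reduction.

def trust_slope(trust_ratings):
    """Least-squares slope of per-session trust ratings (positive = trust is growing)."""
    n = len(trust_ratings)
    if n < 2:
        return 0.0
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(trust_ratings) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, trust_ratings))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def collaborative_error_reduction(cases):
    """cases: dicts with 'human_correct', 'ai_correct', 'joint_correct' (booleans)."""
    n = len(cases)
    if n == 0:
        return 0.0
    human_err = sum(not c["human_correct"] for c in cases) / n
    ai_err = sum(not c["ai_correct"] for c in cases) / n
    joint_err = sum(not c["joint_correct"] for c in cases) / n
    return min(human_err, ai_err) - joint_err  # > 0 means the pair beats its better member
```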

When to Use Each

Use AI-Centric evaluation when the AI operates with minimal human intervention, decisions are relatively standardized, and the human role is primarily oversight.

Use Human-Centric evaluation when humans retain primary authority, AI serves as a tool or assistant, and user experience determines adoption and effectiveness.

Use Symbiotic evaluation when both parties contribute unique capabilities, tasks are complex or novel, and outcomes depend on how well human and AI work together.
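
The selection rule above can be summarized as a small heuristic. The flags in this sketch are assumed characterizations of the system, not terms defined by the framework.

```python
# Sketch: heuristic for choosing an evaluation mode from assumed system flags.

def pick_evaluation_mode(human_has_authority: bool, ai_autonomous: bool, mutual_adaptation: bool) -> str:
    if mutual_adaptation:
        return "symbiotic"       # both parties contribute and adjust to each other
    if ai_autonomous and not human_has_authority:
        return "ai-centric"      # AI operates with minimal intervention; humans oversee
    return "human-centric"       # humans retain authority; AI is a tool or assistant
```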

Limitations

This framework assumes you can cleanly categorize a system into one mode. Reality is messier: many systems shift modes based on task type or confidence level. The framework provides a starting point, not a complete answer.

Related: 05-atom—hmi-to-haic-shift, 05-atom—learning-to-defer-paradigm, 07-molecule—evaluation-methods-tradeoff