The AI Diversity Deficit

AI systems generating ideas produce outputs that cluster more tightly together than ideas from groups of humans working independently.

In controlled experiments, ideas generated by GPT-4 with standard prompts had a cosine similarity of 0.377 to one another, compared with 0.243 for aggregated human ideas. Higher similarity means less diversity, so the gap of 0.134 is substantial. The model produces competent, well-structured ideas, but they occupy a narrower region of the solution space.
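To make the metric concrete, here is a minimal sketch of how a pool-level similarity score could be computed. It assumes the score is the mean cosine similarity over all pairs of idea embeddings, which the note does not confirm. The function names and the toy 3-dimensional vectors are hypothetical stand-ins for real text embeddings.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(embeddings):
    """Average cosine similarity over every unordered pair in the pool.

    Assumption: this averaging scheme is illustrative, not necessarily
    the one used in the experiments the note summarizes.
    """
    pairs = list(combinations(embeddings, 2))
    return sum(cosine_similarity(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy embeddings: one tightly clustered pool, one spread out.
clustered_pool = [[1.0, 0.1, 0.0], [1.0, 0.2, 0.0], [0.9, 0.1, 0.1]]
spread_pool = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(mean_pairwise_similarity(clustered_pool))  # close to 1: low diversity
print(mean_pairwise_similarity(spread_pool))     # 0.0: mutually orthogonal ideas
```

A lower score indicates a more diverse pool, which is the direction in which the human ideas differ from the GPT-4 ideas.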

This matters because innovation success depends on the best idea in the pool, not the average. A more diverse pool is more likely to contain an exceptional outlier. AI’s tendency toward the middle of the distribution limits its value in contexts where variance is the goal.
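The best-of-pool argument can be illustrated with a small simulation. This is a toy model, not a result from the experiments: it assumes idea quality can be treated as independent draws from a normal distribution, and that a more diverse pool corresponds to a wider spread in quality. The function name and all parameter values are hypothetical.

```python
import random
import statistics


def expected_best(pool_size, mean, spread, trials=2000, seed=0):
    """Estimate the average quality of the best idea in a pool.

    Assumption: each idea's quality is an independent normal draw with
    the given mean and standard deviation (a deliberate simplification).
    """
    rng = random.Random(seed)
    bests = [
        max(rng.gauss(mean, spread) for _ in range(pool_size))
        for _ in range(trials)
    ]
    return statistics.mean(bests)


# Two pools with the same average quality but different spread.
narrow = expected_best(pool_size=20, mean=5.0, spread=0.5)
wide = expected_best(pool_size=20, mean=5.0, spread=1.5)

print(f"narrow pool, best idea on average: {narrow:.2f}")
print(f"wide pool, best idea on average:   {wide:.2f}")
```

Both pools have the same mean quality, yet the wider pool yields a better top idea on average. Under these assumptions, a generator that stays near the middle of the distribution is at a disadvantage whenever only the best idea matters.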

Related: 05-atom—cot-diversity-effect, 05-atom—idea-exhaustion-dynamics, 05-atom—uniform-confidence-problem