LLM Stereotype Defaults

Without explicit intervention, LLMs default to stereotypical representations that narrow demographic range and reinforce conventional patterns.

When generating personas via zero-shot prompting, GPT-4o produced:

  • Age range of 27-50 (vs. human-crafted 10-72)
  • Exclusively white-collar tech/business occupations
  • Zero non-binary gender representation (vs. 10% in human-crafted set)
  • Hobbies that align with professional roles
  • Uniformly positive technology attitudes

All ten AI-generated personas were rated as stereotypical by expert coders. The model gravitates toward the probable rather than the possible, producing technically valid but demographically narrow outputs.
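
The comparison above (age range, gender representation, occupation spread) can be checked mechanically on any persona set. The following is a minimal sketch, not the procedure used by the expert coders; the `Persona` fields, the `audit` function, and the sample values are hypothetical placeholders chosen for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Persona:
    # Hypothetical record; field names are illustrative assumptions.
    age: int
    gender: str
    occupation_sector: str


def audit(personas):
    """Summarise the demographic spread of a non-empty persona set."""
    ages = [p.age for p in personas]
    genders = Counter(p.gender for p in personas)
    sectors = Counter(p.occupation_sector for p in personas)
    return {
        "age_min": min(ages),
        "age_max": max(ages),
        "age_span": max(ages) - min(ages),
        # Share of the set held by each gender label that appears at all.
        "gender_share": {g: n / len(personas) for g, n in genders.items()},
        "distinct_sectors": len(sectors),
    }


# Invented sample for demonstration only; these are not the study's personas.
sample = [
    Persona(age=27, gender="woman", occupation_sector="tech"),
    Persona(age=38, gender="man", occupation_sector="business"),
    Persona(age=50, gender="woman", occupation_sector="tech"),
]
report = audit(sample)
```

Running the same audit on a human-crafted set and an AI-generated set puts the two side by side, so a narrow age span or a missing gender category shows up as a number rather than an impression.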

This isn’t a bug to be fixed with better prompts; it’s a structural feature of how language models learn from aggregate data. Diversity requires deliberate design.
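
One possible reading of "deliberate design" is to fix the demographic attributes outside the model and hand them to it as constraints, instead of asking the model to be diverse. The sketch below illustrates that idea under stated assumptions: the attribute lists, `sample_specs`, and `to_prompt` are hypothetical, and only the 10-72 age bounds come from the human-crafted set mentioned above.

```python
import random

# Hypothetical target attributes; the lists are illustrative assumptions.
AGE_BANDS = [(10, 17), (18, 29), (30, 49), (50, 72)]
GENDERS = ["woman", "man", "non-binary"]
SECTORS = ["tech", "healthcare", "trades", "education", "retail"]


def sample_specs(n, seed=0):
    """Draw n persona specs, cycling through age bands and genders
    so that every band and gender is covered once n is large enough."""
    rng = random.Random(seed)  # seeded for reproducible draws
    specs = []
    for i in range(n):
        low, high = AGE_BANDS[i % len(AGE_BANDS)]
        specs.append({
            "age": rng.randint(low, high),  # inclusive on both ends
            "gender": GENDERS[i % len(GENDERS)],
            "sector": rng.choice(SECTORS),
        })
    return specs


def to_prompt(spec):
    """Turn one spec into a constrained generation prompt."""
    return (
        "Write a persona with these fixed attributes: "
        f"age {spec['age']}, gender {spec['gender']}, "
        f"occupation sector {spec['sector']}."
    )


specs = sample_specs(8, seed=1)
prompts = [to_prompt(s) for s in specs]
```

The design choice here is that coverage is decided by the sampling code, not by the model's sense of what is likely. The model still fills in narrative detail, so stereotyped hobbies or attitudes may need a separate check.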

Related: 07-atom—dq-dimensions-ai-training-data, 01-molecule—ai-persona-generation-risks, 05-atom—occupation-hobby-alignment-tells