LLMs Over-Generate Structured Output
When asked to produce structured artifacts like ontologies, schemas, or formal models, LLMs tend to err toward too many elements rather than too few.
The pattern shows up as redundant classes that could be merged, multiple properties with identical domain and range, and elements that aren’t needed to satisfy the stated requirements. In one benchmark, superfluous element rates reached 40-60% for some models.
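One of these symptoms, several properties sharing the same domain and range, is easy to flag mechanically. The sketch below is illustrative only: the `(name, domain, range)` tuple format, the function name, and the sample properties are all assumptions made for this note, not taken from any particular ontology toolkit. A shared signature marks a merge candidate for review; it does not prove the properties are redundant.

```python
from collections import defaultdict


def duplicate_signature_groups(properties):
    """Group property names by (domain, range) and keep groups with more than one member.

    `properties` is an iterable of (name, domain, range) tuples (a hypothetical format).
    """
    groups = defaultdict(list)
    for name, domain, range_ in properties:
        groups[(domain, range_)].append(name)
    # Only signatures shared by two or more properties are worth a reviewer's attention.
    return {sig: names for sig, names in groups.items() if len(names) > 1}


# Hypothetical LLM-generated properties for a small bibliographic ontology.
generated_properties = [
    ("hasAuthor", "Book", "Person"),
    ("writtenBy", "Book", "Person"),
    ("authoredBy", "Book", "Person"),
    ("hasPublisher", "Book", "Organization"),
]

merge_candidates = duplicate_signature_groups(generated_properties)
print(merge_candidates)
```

Here the three author-like properties land in one group, while `hasPublisher` is left alone because nothing else shares its signature.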
This isn’t random padding: the model is generating plausible extensions based on what typically appears in similar structures. It’s pattern completion applied to formal systems, and formal systems punish redundancy differently than natural language does.
The practical implication: when using LLMs for structured output, plan for pruning rather than elaboration. The editing task is identifying what to remove, not what to add. Review processes should be calibrated for this asymmetry.
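A pruning-first review can be sketched as a set difference between what the model generated and what the requirements call for. This is a minimal illustration under stated assumptions: the function name, the sample class lists, and the definition of the rate (superfluous elements divided by all generated elements) are choices made here, and the benchmark mentioned above may define its rate differently.

```python
def pruning_report(generated, required):
    """Compare generated elements against required ones.

    Returns removal candidates, any required elements that are missing, and
    the share of generated elements not required (an assumed definition of
    the superfluous rate).
    """
    generated = set(generated)
    required = set(required)
    superfluous = generated - required
    missing = required - generated
    rate = len(superfluous) / len(generated) if generated else 0.0
    return {
        "superfluous": sorted(superfluous),
        "missing": sorted(missing),
        "superfluous_rate": rate,
    }


# Hypothetical example: classes an LLM produced vs. classes the requirements need.
generated_classes = ["Book", "Author", "Writer", "Publisher", "Genre"]
required_classes = ["Book", "Author", "Publisher"]

report = pruning_report(generated_classes, required_classes)
print(report)
```

The report puts removal candidates first and treats missing elements as the secondary check, which mirrors the asymmetry: most of the review effort goes into deciding what to cut.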
Related: [None yet]