Hallucination as Untraceable Accuracy
In knowledge engineering tasks, LLM “hallucination” takes on a specific character: the generated content may be factually correct but isn’t traceable to the input data.
When researchers asked LLMs to create ontologies from interview transcripts, 19-32% of the generated classes weren't mentioned anywhere in the interviews. Many of these were accurate within the domain; the LLM drew on its training data rather than fabricating. But they couldn't be traced back to the expert interviews, which defeats the purpose of knowledge elicitation.
This is a different problem from hallucinating false facts. The issue is provenance: correct information from an unknown source can't be checked against the interviews, which undermines the validation pipeline.
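A rough sketch of what a provenance filter could look like, assuming the generated classes arrive as plain-text labels and the transcript as raw text. The function names and the lexical-matching heuristic are illustrative assumptions, not the method used in the cited work:

```python
# Minimal sketch of a post-hoc traceability check (illustrative names only).
from difflib import get_close_matches

def is_traceable(label: str, transcript_words: set[str], cutoff: float = 0.85) -> bool:
    """A label counts as traceable if every word in it has an exact or
    near-exact match somewhere in the transcript vocabulary."""
    for word in label.lower().replace("_", " ").split():
        match = get_close_matches(word, transcript_words, n=1, cutoff=cutoff)
        if word not in transcript_words and not match:
            return False
    return True

def flag_untraceable(classes: list[str], transcript: str) -> list[str]:
    """Return the generated classes that can't be grounded in the interview,
    regardless of whether they happen to be factually correct for the domain."""
    vocab = set(transcript.lower().split())
    return [c for c in classes if not is_traceable(c, vocab)]
```

Lexical matching is deliberately conservative: a class that paraphrases the expert's wording gets flagged even though it's grounded, but for elicitation that false positive is a safer failure mode than letting an unsourced class pass silently.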
Related: 04-atom—provenance-design, 05-atom—uniform-confidence-problem