Investigating Knowledge Elicitation Automation with Large Language Models
van den Bent, Pernisch & Schlobach (2025)
Citation
van den Bent, S., Pernisch, R., & Schlobach, S. (2025). Investigating Knowledge Elicitation Automation with Large Language Models. Transportation Research Record.
Core Question
Can LLMs replace or assist in the expensive, time-intensive process of extracting expert knowledge and encoding it into formal ontologies?
Method
Compared four pipeline variants for knowledge elicitation:
- AI interview → AI ontology
- AI interview → Human ontology
- Human interview → AI ontology
- Human interview → Human ontology
Used the Dungeons & Dragons domain as a test case (well-documented, accessible experts, training data available). Evaluated the resulting ontologies against a manually created “base truth” using OQuaRE metrics and structural analysis (sketched below).
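The comparison can be pictured as a small 2x2 harness. A minimal sketch, assuming hypothetical function names (run_interview, build_ontology) that merely stand in for the interview and ontology-construction stages; none of these identifiers come from the paper or its repository:

```python
from itertools import product

# Hypothetical stand-ins for the two pipeline stages; these names do not
# come from the paper or its repository.
def run_interview(interviewer: str) -> str:
    """LLM-driven ('ai') or human-led ('human') expert interview."""
    return f"transcript from {interviewer} interview"

def build_ontology(engineer: str, transcript: str) -> str:
    """LLM-based ('ai') or manual ('human') ontology construction."""
    return f"ontology built by {engineer} from '{transcript}'"

# The four pipeline variants compared in the study; each resulting ontology
# would then be scored against the manually created base truth
# (OQuaRE metrics, structural analysis).
for interviewer, engineer in product(("ai", "human"), repeat=2):
    ontology = build_ontology(engineer, run_interview(interviewer))
    print(f"{interviewer} interview -> {engineer} ontology: {ontology}")
```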
Key Findings
Interview Phase:
- AI interviews were 3.5x faster (~10 min vs ~35 min)
- AI responses were more structured, information-dense, and on-topic
- Human interviews captured tacit knowledge and nuance AI missed
Ontology Generation:
- Human-created ontologies captured more information and showed better hierarchical alignment
- AI-created ontologies were more standardized but consistently smaller
- 19-32% of AI-generated classes were “hallucinated” (not present in the interview data; see the traceability sketch after this list)
- AI struggled with class vs. instance distinction
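One way to picture the hallucination figure is as a traceability check: count generated class labels that never occur in the interview transcript. A crude lexical sketch with invented toy data, not the paper's actual analysis:

```python
import re

def untraceable_classes(class_labels, transcript):
    """Return generated class labels that never occur in the interview
    transcript, i.e. content the LLM introduced rather than elicited.
    A crude lexical heuristic, not the paper's actual analysis."""
    text = transcript.lower()
    return [label for label in class_labels
            if not re.search(r"\b" + re.escape(label.lower()) + r"\b", text)]

# Toy data for illustration only.
transcript = "A wizard is a character class that casts spells from a spellbook."
generated = ["Wizard", "Spellbook", "Dragon", "Alignment"]

missing = untraceable_classes(generated, transcript)
print(missing, f"{len(missing) / len(generated):.0%}")
# ['Dragon', 'Alignment'] 50%
```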
Hybrid Approach: The best results came from AI-led interviews + human-led ontology construction, combining the speed of AI data collection with human skill at semantic structuring.
Transferable Insights
- LLMs excel at explicit knowledge collection but struggle with tacit→explicit conversion
- Structural consistency and semantic richness are in tension
- Hallucination in knowledge engineering means facts may be correct but untraceable
- Content moderation creates domain modeling blind spots (AI omitted “Race” entirely)
- Competency questions reveal gaps: if information is not explicitly asked for, it isn't captured (see the sketch below)
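A minimal sketch of competency-question checking, assuming rdflib and an invented namespace: each question is paired with a SPARQL ASK query, and an unanswerable query marks information the elicitation did not capture:

```python
from rdflib import Graph, Namespace, RDF, RDFS, OWL

EX = Namespace("http://example.org/dnd#")  # hypothetical namespace

g = Graph()
g.bind("ex", EX)
# Toy ontology fragment standing in for an elicited ontology.
g.add((EX.CharacterClass, RDF.type, OWL.Class))
g.add((EX.Wizard, RDF.type, OWL.Class))
g.add((EX.Wizard, RDFS.subClassOf, EX.CharacterClass))

# Competency questions paired with SPARQL ASK queries; a question the
# model cannot answer marks a gap in the elicited knowledge.
competency_questions = {
    "Is Wizard modelled as a kind of character class?":
        "ASK { ex:Wizard rdfs:subClassOf ex:CharacterClass }",
    "Is Race modelled at all?":
        "ASK { ex:Race ?p ?o }",
}

for question, ask in competency_questions.items():
    answered = g.query(ask, initNs={"ex": EX, "rdfs": RDFS}).askAnswer
    print(f"{question} -> {'covered' if answered else 'GAP'}")
```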
Connections
- 06-molecule—seci-framework - Maps to Combination quadrant; humans needed for Externalization
- 06-atom—tacit-knowledge - Core limitation acknowledged
- 07-molecule—hybrid-human-ai-workflows - Key recommendation
- 06-molecule—llm-hallucination-knowledge-engineering
GitHub Repository
https://github.com/sheridavandenbent/automated-knowledge-elicitation