Shimizu & Hitzler 2024
Full Title: Accelerating Knowledge Graph and Ontology Engineering with Large Language Models
Authors: Cogan Shimizu (Wright State), Pascal Hitzler (Kansas State)
Publication: Preprint, November 2024
Core Argument
Modularity is the key enabler for LLM-based knowledge graph and ontology engineering. Large ontologies defy both human comprehension and LLM processing, but breaking them into conceptually coherent modules dramatically improves results on hard KGOE tasks.
Key Contributions
The paper proposes using "conceptual modules" (partitions of an ontology that make sense to domain experts) as the organizational unit for LLM-based tasks. These modules serve as bridges between human conceptualization and data structure.
The striking finding: On complex ontology alignment, providing full ontologies to an LLM produced “essentially unusable” results. But a two-stage approach (first identify relevant modules, then work within them) achieved 95% accuracy on the same benchmark.
Transferable Insights
- Conceptual coherence improves LLM performance: Modules aren't just smaller; they're conceptually bounded. This tighter scope provides better "priming" for the LLM.
- Two-stage prompting pattern: First ask which modules are relevant, then work within those modules. Generalizable beyond ontology work.
- LLMs as approximate knowledge bases: A useful framing that acknowledges both capability and limitation. LLMs can be "approximately queried" but require human verification.
- The KGOE task hierarchy: Modeling → Alignment → Population → Entity Disambiguation. Ordered by abstraction level.
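The two-stage prompting pattern above can be sketched in code. This is a minimal illustration, not the paper's implementation: `call_llm` is a hypothetical stand-in for any chat-completion API (stubbed here so the example runs offline), and the module names and prompts are invented for demonstration.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM API call.

    Stubbed with canned responses so the sketch runs without a model.
    """
    if "Which modules" in prompt:
        return "Cruise, Vessel"
    return "gl:Cruise owl:equivalentClass ex:CruiseEvent"


def two_stage_align(modules: dict[str, str], task: str) -> str:
    # Stage 1: ask only for the names of the relevant modules,
    # without putting the full ontology into the context window.
    names = call_llm(
        f"Which modules are relevant to this task: {task}\n"
        f"Available modules: {', '.join(modules)}"
    )
    selected = [n.strip() for n in names.split(",") if n.strip() in modules]

    # Stage 2: run the actual task with only the selected modules in
    # context, keeping the prompt small and conceptually bounded.
    context = "\n".join(modules[name] for name in selected)
    return call_llm(f"{task}\n\nRelevant modules:\n{context}")


# Toy ontology fragments (invented for illustration).
modules = {
    "Cruise": "Class: Cruise ...",
    "Vessel": "Class: Vessel ...",
    "Port": "Class: Port ...",
}
result = two_stage_align(modules, "Align gl:Cruise with the target ontology.")
```

The key design point is that the full set of module *descriptions* never enters a single prompt: stage 1 sees only module names, and stage 2 sees only the selected modules' content.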
Related Work
- MOMo (Modular Ontology Modeling) methodology
- Ontology Design Patterns (ODPs)
- OAEI GeoLink benchmark for complex alignment