Ontology Generation using Large Language Models

Citation

Lippolis, A.S., Saeedizade, M.J., Keskisärkkä, R., et al. (2025). Ontology Generation using Large Language Models. arXiv:2503.05388v1 [cs.AI].

Core Question

To what extent can LLMs generate, from natural-language requirements (user stories and competency questions), OWL ontologies that meet the needs of ontology engineers?

Key Contributions

  1. Two new prompting techniques for automated ontology development:

    • Memoryless CQbyCQ: Processes each competency question independently, keeping the context size small (a minimal sketch follows this list)
    • Ontogenia: Chain-of-thought approach with metacognitive prompting and ontology design patterns
  2. Multi-dimensional evaluation framework combining:

    • Standard ontology quality checks (the OOPS! pitfall scanner)
    • Proportion of competency questions adequately modelled
    • Structural analysis (superfluous elements)
    • Expert qualitative evaluation
  3. Benchmark dataset: 10 ontologies, 100 CQs, 29 user stories from real-world semantic web projects
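
The paper's exact prompts are not reproduced here; purely as a minimal sketch of the memoryless idea, the Python snippet below sends each CQ in a fresh request with no conversation history, carrying forward only the latest ontology serialization. The prompt wording, the function name, and the default model choice are assumptions, not the authors' published prompt; the client usage follows the standard openai-python v1 API.

```python
# Minimal sketch of the Memoryless CQbyCQ loop. Prompt wording and
# function name are assumptions, not the paper's published prompt.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = """You are an ontology engineer.

User story:
{story}

Current ontology (Turtle, may be empty):
{ontology}

Extend the ontology so the competency question below can be answered.
Return the complete updated ontology in Turtle, and nothing else.

Competency question: {cq}
"""

def memoryless_cq_by_cq(story: str, cqs: list[str], model: str = "o1-preview") -> str:
    ontology = ""  # start from an empty ontology
    for cq in cqs:
        response = client.chat.completions.create(
            model=model,
            messages=[{"role": "user",
                       "content": PROMPT.format(story=story, ontology=ontology, cq=cq)}],
        )
        # Replace, never accumulate chat history: each CQ sees only the
        # story and the latest ontology, so the context size stays bounded.
        ontology = response.choices[0].message.content
    return ontology
```

The replacement on the last loop line is the point of the technique: because only the current ontology, not the dialogue history, is carried forward, the context stays roughly constant no matter how many CQs have been processed, which is consistent with the context-size finding below.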

Main Findings

  • OpenAI o1-preview with Ontogenia produced the best results (0.97-1.0 proportion of CQs adequately modelled)
  • Both techniques significantly outperformed novice ontology engineers
  • Reducing context size improved performance (the memoryless variant outperformed its memory-based counterpart)
  • LLMs struggle with complex modelling patterns (reification, property restrictions) but handle simple properties well
  • LLMs consistently over-generate, producing superfluous classes and properties (a rough structural check is sketched after this list)
  • Common errors: incorrect domain/range restrictions, erroneous inverse properties, overlapping elements
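
The paper's structural analysis judges superfluity against the CQs; as a much cruder, purely structural stand-in (not the authors' tooling), the rdflib sketch below flags declared classes and properties that no other axiom references at all, the most obvious form of over-generation. What counts as "referenced" here is an assumption of this sketch.

```python
# Crude structural proxy (not the authors' tooling): flag declared classes
# and properties that nothing else in the graph ever references.
from rdflib import Graph
from rdflib.namespace import RDF, RDFS, OWL

ANNOTATIONS = {RDF.type, RDFS.label, RDFS.comment}

def referenced(g: Graph, term) -> bool:
    """True if some triple mentions `term` beyond its own declaration
    and annotation triples (rdfs:label, rdfs:comment)."""
    for s, p, o in g:
        if p == term or o == term:
            return True  # used as a predicate, or as a domain/range/superclass/...
        if s == term and p not in ANNOTATIONS:
            return True  # subject of a substantive axiom, e.g. rdfs:subClassOf
    return False

def dangling_terms(path: str) -> dict:
    g = Graph()
    g.parse(path)  # rdflib infers the serialization from the file extension
    classes = set(g.subjects(RDF.type, OWL.Class))
    props = (set(g.subjects(RDF.type, OWL.ObjectProperty))
             | set(g.subjects(RDF.type, OWL.DatatypeProperty)))
    return {
        "classes": sorted(c for c in classes if not referenced(g, c)),
        "properties": sorted(p for p in props if not referenced(g, p)),
    }

# Example: dangling_terms("generated_ontology.ttl")
```

A term that survives this check can still be superfluous in the paper's sense (declared but not needed to answer any CQ); that stricter judgment requires comparison against the requirements, not just the graph structure.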

Relevance to heyMHK

  • Direct application to knowledge engineering methodology
  • Evidence on LLM capabilities for structured knowledge tasks
  • Multi-dimensional evaluation approach applicable to assessing other kinds of LLM output
  • Context window management insights transfer to other prompting scenarios
  • The “superfluous generation” pattern appears across LLM tasks requiring precision

Extracted Content

Atoms:

Molecules: