Database Schema as Ontology Source

Context

You want to build a knowledge graph for RAG but lack a formal ontology. Learning one from text is expensive (repeated LLM inference) and requires complex merging as new documents arrive.

Problem

Text-based ontology learning creates a maintenance burden: every new document can introduce redundant or conflicting entities, requiring sophisticated alignment pipelines. The ontology keeps changing as the corpus grows.

Solution

Extract your ontology from existing relational database schemas instead of from text. The schema already encodes:

  • Entity types (tables)
  • Attributes (columns)
  • Relationships (foreign keys)
  • Constraints (validation rules)
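
As a rough illustration of this mapping, the sketch below loads a small DDL script into an in-memory SQLite database and reads the schema back through SQLite's PRAGMA statements. The customer/orders tables and the extract_schema helper are hypothetical, and SQLite only stands in for whatever database you actually use; other engines expose the same information through their own catalogs.

```python
import sqlite3

# Hypothetical example schema; not taken from any real system.
DDL = """
CREATE TABLE customer (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    email TEXT
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    total REAL
);
"""

def extract_schema(ddl: str) -> dict:
    """Return attributes, NOT NULL columns and foreign keys per table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(ddl)
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' "
        "AND name NOT LIKE 'sqlite_%' ORDER BY name")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        # PRAGMA foreign_key_list rows: (id, seq, table, from, to, ...)
        fks = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
        schema[table] = {
            "attributes": {c[1]: c[2] for c in columns},        # column -> declared type
            "required": [c[1] for c in columns if c[3]],        # NOT NULL constraints
            "relationships": [(fk[3], fk[2], fk[4]) for fk in fks],  # (column, target table, target column)
        }
    conn.close()
    return schema

schema = extract_schema(DDL)
```

Each table name is a candidate entity type, each entry in "attributes" a candidate property, and each tuple in "relationships" a candidate relation between two entity types.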

The process:

  1. Parse DDL statements to extract tables, columns, and keys
  2. Use an LLM to generate ontology elements (classes, properties, relations) from the schema structure
  3. Validate the result and merge it with any reference domain ontologies
  4. Use the resulting ontology to constrain KG construction from your text corpus
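
Step 4 can be sketched as a filter over candidate triples. The example assumes the ontology has already been reduced to a table of relations, each with a domain and a range, and that some upstream extractor (not shown) produces typed triples from text. The relation names, the Triple type, and the is_valid helper are all hypothetical.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str
    subject_type: str
    predicate: str
    obj: str
    object_type: str

# Hypothetical ontology derived from a schema: tables become classes,
# foreign keys become relations. Maps predicate -> (domain, range).
RELATIONS = {"placed_by": ("Order", "Customer")}

def is_valid(triple: Triple) -> bool:
    """Accept a triple only if its predicate and types fit the ontology."""
    if triple.predicate not in RELATIONS:
        return False
    domain, range_ = RELATIONS[triple.predicate]
    return (triple.subject_type, triple.object_type) == (domain, range_)

candidates = [
    Triple("order-17", "Order", "placed_by", "Ada", "Customer"),
    Triple("Ada", "Customer", "placed_by", "order-17", "Order"),   # wrong direction
    Triple("order-17", "Order", "shipped_via", "DHL", "Carrier"),  # unknown relation
]
accepted = [t for t in candidates if is_valid(t)]
```

Only the first candidate passes. Rejecting rather than repairing out-of-ontology triples is one possible design choice; it keeps the graph consistent with the schema at the cost of dropping facts the database does not model.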

Consequences

Benefits:

  • Near one-time ontology learning (schemas change rarely)
  • No complex merging as content grows
  • Schema reflects existing domain expertise
  • Lower LLM inference cost for ontology learning, since the LLM reads the schema once instead of every document
  • RAG performance comparable to that of text-derived ontologies

Tradeoffs:

  • Requires an existing database with good schema design
  • May miss concepts not represented in the database
  • Schema evolution still requires ontology updates (but schema changes are infrequent)
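
One way to keep the last tradeoff manageable is to diff schema snapshots and revisit only the affected ontology elements. The sketch below assumes each snapshot is a plain mapping from table name to a set of column names; the diff_schema helper and the sample tables are hypothetical.

```python
def diff_schema(old: dict, new: dict) -> dict:
    """Compare two {table: set_of_columns} snapshots."""
    changes = {
        "added_tables": sorted(new.keys() - old.keys()),
        "removed_tables": sorted(old.keys() - new.keys()),
        "changed_tables": {},
    }
    # For tables present in both snapshots, report column-level changes.
    for table in sorted(old.keys() & new.keys()):
        added = sorted(new[table] - old[table])
        removed = sorted(old[table] - new[table])
        if added or removed:
            changes["changed_tables"][table] = {"added": added, "removed": removed}
    return changes

old = {"customer": {"id", "name"}, "orders": {"id", "customer_id"}}
new = {
    "customer": {"id", "name", "email"},
    "orders": {"id", "customer_id"},
    "shipment": {"id", "order_id"},
}
changes = diff_schema(old, new)
```

Here the diff flags one new entity type (shipment) and one new attribute (customer.email); the orders table is unchanged, so its ontology elements need no review.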

When to Use

This pattern works when:

  • You have a well-designed relational database capturing domain entities
  • Your text corpus relates to the same domain as the database
  • You want graph-based RAG benefits without ongoing ontology maintenance
  • LLM inference cost is a real constraint

Related: 06-atom—ontology-guided-kg-construction, 06-molecule—structure-plus-content, 06-molecule—knowledge-graph-construction