LLM Knowledge Archaeology

The Concept

Large language models can excavate tacit knowledge that has fragmented across an organization by systematically aggregating partial knowledge from multiple sources, without needing to identify or access the original domain expert.

Why This Matters

Traditional knowledge management assumes you need to find the expert and get them to externalize what they know. This is often the hardest part: experts are busy, their knowledge is tacit, and they may not even know what’s worth documenting.

But knowledge doesn’t stay with experts. It spreads through conversations, gets partially absorbed by colleagues, and fragments into pieces held by different people. If an LLM agent can navigate an organization, ask the right questions, and assemble those fragments, it can reconstruct knowledge that no single person could articulate.

This shifts the problem from “elicitation” to “aggregation.”

How It Works

The agent operates through iterative prompt-chaining:

  1. Engage: Start with someone in the hierarchy, establish context
  2. Query: Ask targeted questions based on current knowledge gaps
  3. Integrate: Update internal knowledge state with responses
  4. Critique: Self-evaluate completeness, identify remaining gaps
  5. Navigate: Decide whether to continue with current source or follow referrals to others
  6. Repeat: Until knowledge state reaches acceptable completeness

The self-critique loop is essential: the agent must know what it doesn’t know to ask better questions.
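
The loop above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation behind this note: the names Source, critique, and excavate are hypothetical, each "conversation" is a plain dictionary lookup standing in for an LLM-mediated exchange, and the set of required topics is given up front, whereas a real agent would have to estimate its own gaps.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Source:
    """Hypothetical person holding fragments of the dispersed knowledge."""
    name: str
    fragments: dict                                 # topic -> what this person knows
    referrals: list = field(default_factory=list)   # names of others worth asking

    def ask(self, topic: str) -> Optional[str]:
        # Stand-in for one LLM-mediated question; None means "I don't know".
        return self.fragments.get(topic)


def critique(required: set, known: dict) -> set:
    """Critique step: return the topics still missing (assumes the target is known)."""
    return required - known.keys()


def excavate(org: dict, start: str, required: set, max_interviews: int = 10):
    """Walk the organization, collecting fragments until no gaps remain."""
    known, queue, visited = {}, [start], []
    while queue and len(visited) < max_interviews:
        name = queue.pop(0)
        if name in visited or name not in org:
            continue
        source = org[name]                                # Engage
        visited.append(name)
        for topic in sorted(critique(required, known)):   # Query current gaps
            answer = source.ask(topic)
            if answer is not None:
                known[topic] = answer                     # Integrate
        if not critique(required, known):                 # Critique
            break
        queue.extend(source.referrals)                    # Navigate via referrals
    return known, critique(required, known), visited


# Toy organization: three people each hold one piece of a schema's meaning.
org = {
    "alice": Source("alice", {"orders.status": "3 means refunded"}, ["bob"]),
    "bob": Source("bob", {"orders.region_code": "legacy sales regions"}, ["carol"]),
    "carol": Source("carol", {"customers.tier": "set by annual spend"}),
}
required = {"orders.status", "orders.region_code", "customers.tier"}
known, gaps, visited = excavate(org, "alice", required)
```

In this toy run the agent needs all three people to close every gap, and none of them could have supplied the full picture alone. Capping max_interviews at 2 leaves customers.tier unresolved, which shows how the result depends on referral paths actually reaching the people who hold each fragment.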

Limitations

  • Requires knowledge to have actually spread (won’t work if the expert hoarded information)
  • Quality depends on informal network density
  • Self-critique is an imperfect proxy for actual completeness
  • Tested in simulation; real organizational dynamics may differ
  • Works for structured knowledge (e.g., database schemas); less clear for deeply tacit skills

When This Applies

Organizations where:

  • Domain experts are unavailable, overloaded, or have left
  • Knowledge has had time to disperse through informal channels
  • The goal is documentation rather than skill transfer
  • Formal hierarchy doesn’t reflect actual knowledge flow

Related: 06-molecule—distributed-knowledge-reconstruction, 06-molecule—seci-framework, 06-atom—tacit-knowledge-definition