Towards Practical GraphRAG

Min et al., 2025 (SAP)

Core Framing

The paper frames GraphRAG as a practical engineering challenge rather than a research frontier. The key insight: “careful engineering of classical NLP techniques can match modern LLM-based approaches while enabling practical, cost-effective, and domain-adaptable retrieval-augmented reasoning at scale.”

This positions dependency parsing, a decades-old technique, as a legitimate alternative to expensive LLM-based extraction. The framing itself is contrarian in the current hype cycle.

Key Findings

  • Dependency parsing achieves roughly 94% of LLM-based performance (61.87% vs. 65.83% semantic alignment; 61.87 / 65.83 ≈ 0.94) for knowledge graph construction
  • GraphRAG shows 15% improvement over vanilla vector retrieval on enterprise code migration tasks
  • The construction bottleneck, not query-time latency, is the primary barrier to GraphRAG adoption
  • Hybrid retrieval (graph traversal + vector similarity via RRF) outperforms either approach alone
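
To make the last finding concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked lists. The function name, the chunk IDs, and the constant k=60 (a commonly used default) are illustrative assumptions; none of them are taken from the paper.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of IDs; k=60 is an assumed, commonly used constant."""
    scores = defaultdict(float)
    for ranking in rankings:
        # Each list contributes 1 / (k + rank) for every item it contains.
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by ID so the output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from the two retrievers (IDs are made up for illustration).
graph_hits = ["chunk_a", "chunk_b", "chunk_c"]
vector_hits = ["chunk_b", "chunk_d", "chunk_a"]

fused = reciprocal_rank_fusion([graph_hits, vector_hits])
print(fused)
```

Items that both retrievers return (chunk_a, chunk_b) accumulate two contributions and rise above items found by only one, which is the intuition behind fusing graph and vector results.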

Methodological Contributions

  1. Dual extraction architecture supporting both dependency-based (fast/cheap) and LLM-based (accurate/expensive) paths
  2. Multi-granular embeddings with separate vectors for entities, chunks, and relations
  3. Cascaded retrieval combining high-recall graph traversal with precision-oriented re-ranking
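
As a rough illustration of the dependency-based path in item 1, the sketch below pulls (subject, verb, object) triples out of a sentence that has already been parsed. The Token type, the hand-written parse, and the label names ("nsubj", "dobj") are assumptions for illustration; in practice a dependency parser would supply them, and the paper's actual extraction rules are likely richer.

```python
from typing import NamedTuple


class Token(NamedTuple):
    text: str
    dep: str   # dependency label (assumed convention: "nsubj", "dobj", "ROOT")
    head: int  # index of this token's head within the sentence


def extract_triples(tokens):
    """Return (subject, verb, object) triples from one pre-parsed sentence."""
    subjects, objects = {}, {}
    for tok in tokens:
        # Group subjects and direct objects by the index of their head token.
        if tok.dep == "nsubj":
            subjects.setdefault(tok.head, []).append(tok.text)
        elif tok.dep == "dobj":
            objects.setdefault(tok.head, []).append(tok.text)
    triples = []
    for head in sorted(subjects):
        for subj in subjects[head]:
            # A head with a subject but no object yields no triple.
            for obj in objects.get(head, []):
                triples.append((subj, tokens[head].text, obj))
    return triples


# Hand-written parse of "report calls function" (not produced by a real parser).
sentence = [
    Token("report", "nsubj", 1),
    Token("calls", "ROOT", 1),
    Token("function", "dobj", 1),
]

triples = extract_triples(sentence)
print(triples)
```

The appeal of this path is that it needs no LLM call per chunk; the trade-off, as the limitations below note, is that relations not expressed through explicit syntax are missed.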

Enterprise Context

The approach is evaluated on SAP’s Custom Code Migration use case: migrating legacy ABAP code to S/4HANA. This is a legitimate enterprise problem with complex entity relationships (transactions, screen structures, compatibility rules) that similarity search alone can’t capture.

Limitations Acknowledged

  • Dependency parsing may miss implicit or context-dependent relations
  • Generalizability beyond the code migration domain remains an open question
  • Only one-hop traversal tested; deeper traversals not explored
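
To show what the one-hop restriction in the last point means in practice, here is a small breadth-first sketch with a configurable hop limit. The toy graph, node names, and function name are hypothetical and do not come from the paper.

```python
from collections import deque


def neighbors_within(graph, seeds, max_hops=1):
    """Map each node reachable from the seeds in at most max_hops edges to its hop distance."""
    dist = {seed: 0 for seed in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == max_hops:
            continue  # do not expand past the hop limit
        for nxt in graph.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


# Toy adjacency list (illustrative entity names only).
graph = {
    "transaction": ["screen", "rule"],
    "screen": ["field"],
    "rule": [],
    "field": [],
}

one_hop = neighbors_within(graph, ["transaction"], max_hops=1)
two_hop = neighbors_within(graph, ["transaction"], max_hops=2)
print(one_hop)
print(two_hop)
```

With max_hops=1 the "field" node is never reached; raising the limit to 2 brings it in, at the cost of a larger candidate set to re-rank.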

Strategic Value

This paper provides concrete evidence for the “vectors vs. graphs” distinction already in the garden. The 94% threshold is a useful heuristic for when “good enough” classical approaches are the better choice over expensive modern methods: they give up a few points of quality in exchange for much lower construction cost.


Related: 07-molecule—vectors-vs-graphs, 05-atom—the-94-percent-threshold, 06-atom—construction-bottleneck-problem