Towards Practical GraphRAG

Min et al., 2025 (SAP)

Core Framing

The paper frames GraphRAG as a practical engineering challenge rather than a research frontier. The key insight: “careful engineering of classical NLP techniques can match modern LLM-based approaches while enabling practical, cost-effective, and domain-adaptable retrieval-augmented reasoning at scale.”

This positions dependency parsing, a decades-old technique, as a legitimate alternative to expensive LLM-based extraction. The framing itself is contrarian in the current hype cycle.

Key Findings

Dependency parsing achieves 94% of LLM-based performance (61.87% vs. 65.83% semantic alignment, a ratio of 61.87 / 65.83 ≈ 0.94) for knowledge graph construction
GraphRAG shows 15% improvement over vanilla vector retrieval on enterprise code migration tasks
The construction bottleneck, not query-time latency, is the primary barrier to GraphRAG adoption
Hybrid retrieval (graph traversal + vector similarity via RRF) outperforms either approach alone
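
To make the hybrid retrieval finding concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF), the standard technique for merging ranked lists. The function name, the chunk ids, and the constant k=60 (a common default for RRF) are assumptions for illustration and are not taken from the paper's implementation.

```python
from collections import defaultdict

def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one ranking.

    Each id scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, conventional default.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so output is deterministic.
    return sorted(scores, key=lambda d: (-scores[d], d))

# Hypothetical result lists from the two retrievers.
graph_hits = ["chunk_a", "chunk_b", "chunk_c"]
vector_hits = ["chunk_b", "chunk_d", "chunk_a"]

fused = reciprocal_rank_fusion([graph_hits, vector_hits])
print(fused)
```

Because RRF uses only ranks, it needs no calibration between graph-traversal scores and cosine similarities, which is likely why it suits this kind of hybrid setup. Items found by both retrievers rise to the top.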

Methodological Contributions

Dual extraction architecture supporting both dependency-based (fast/cheap) and LLM-based (accurate/expensive) paths
Multi-granular embeddings with separate vectors for entities, chunks, and relations
Cascaded retrieval combining high-recall graph traversal with precision-oriented re-ranking
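
The cascaded retrieval idea can be sketched as two stages: a one-hop graph expansion that gathers candidates broadly, then a similarity re-rank that trims them. Everything below is a toy under stated assumptions: the entity names, the chunk ids, the two-dimensional vectors, and the function names are hypothetical, and cosine similarity stands in for whatever re-ranker the paper actually uses.

```python
import math

# Hypothetical toy data: entity graph, entity-to-chunk links, chunk embeddings.
GRAPH = {
    "TransactionX": {"ScreenY", "RuleZ"},
    "ScreenY": {"TransactionX"},
    "RuleZ": {"TransactionX"},
    "Unrelated": set(),
}
ENTITY_CHUNKS = {
    "TransactionX": ["c1"],
    "ScreenY": ["c2"],
    "RuleZ": ["c3"],
    "Unrelated": ["c4"],
}
CHUNK_VECS = {
    "c1": [1.0, 0.0],
    "c2": [0.6, 0.8],
    "c3": [0.0, 1.0],
    "c4": [1.0, 0.0],
}

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def one_hop_candidates(seed_entities):
    """Stage 1 (recall): chunks linked to the seeds or their direct neighbours."""
    entities = set(seed_entities)
    for entity in seed_entities:
        entities |= GRAPH.get(entity, set())
    return sorted({c for e in entities for c in ENTITY_CHUNKS.get(e, [])})

def cascaded_retrieve(seed_entities, query_vec, top_k=2):
    """Stage 2 (precision): re-rank the candidates by similarity to the query."""
    candidates = one_hop_candidates(seed_entities)
    ranked = sorted(candidates, key=lambda c: (-cosine(query_vec, CHUNK_VECS[c]), c))
    return ranked[:top_k]

result = cascaded_retrieve(["ScreenY"], [0.0, 1.0])
print(result)
```

Starting from ScreenY, the traversal reaches TransactionX but not RuleZ, which is two hops away, so chunk c3 never becomes a candidate even though it matches the query vector best. That is the trade-off of a one-hop first stage.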

Enterprise Context

The approach is evaluated on SAP’s Custom Code Migration use case: migrating legacy ABAP code to S/4HANA. This is a legitimate enterprise problem with complex entity relationships (transactions, screen structures, compatibility rules) that similarity search alone can’t capture.

Limitations Acknowledged

Dependency parsing may miss implicit or context-dependent relations
Generalizability beyond the code migration domain remains an open question
Only one-hop traversal tested; deeper traversals not explored
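
The first limitation is easier to see with a toy example of dependency-based triple extraction. The sketch below is illustrative only and is not the paper's pipeline: the parses are hand-written tuples with Universal Dependencies style labels rather than real parser output, and the function name and sentences are hypothetical.

```python
# Each token is (index, text, dependency label, head index).
# Hand-written parse of "The report calls the transaction."
EXPLICIT = [
    (0, "The", "det", 1),
    (1, "report", "nsubj", 2),
    (2, "calls", "ROOT", 2),
    (3, "the", "det", 4),
    (4, "transaction", "obj", 2),
]

# Hand-written parse of "The transaction is obsolete." (no object present).
NO_OBJECT = [
    (0, "The", "det", 1),
    (1, "transaction", "nsubj", 3),
    (2, "is", "cop", 3),
    (3, "obsolete", "ROOT", 3),
]

def extract_svo(parse):
    """Return (subject, verb, object) triples for heads that have both."""
    text_of = {idx: text for idx, text, _, _ in parse}
    subjects, objects = {}, {}
    for idx, text, dep, head in parse:
        if dep == "nsubj":
            subjects.setdefault(head, []).append(text)
        elif dep in ("obj", "dobj"):
            objects.setdefault(head, []).append(text)
    return [
        (s, text_of[v], o)
        for v in sorted(subjects)
        for s in subjects[v]
        for o in objects.get(v, [])
    ]

print(extract_svo(EXPLICIT))
print(extract_svo(NO_OBJECT))
```

A rule like this only fires when a relation is spelled out as subject, verb, and object within one sentence. Anything stated indirectly or across sentences yields no triple, which is the kind of gap an LLM-based extractor may be able to fill.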

Strategic Value

This paper provides concrete evidence for the “vectors vs. graphs” distinction already in the garden. The 94% threshold is a useful heuristic for when “good enough” classical approaches beat expensive modern methods.


Towards Practical GraphRAG: Efficient Knowledge Graph Construction and Hybrid Retrieval at Scale