Entity Linking Reduces Dimensionality by Collapsing Synonyms
When you link extracted entities to a knowledge base (like DBpedia), something useful happens beyond enrichment: dimensionality reduction.
Synonymous named entities and noun phrases collapse into the same type and hypernym branch. “Colorectal cancer,” “CRC,” and “colon cancer” all resolve to the same concept. Instead of three separate codes to track, you have one.
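A minimal sketch of the collapse, assuming a hand-written lookup table in place of a real entity linker. `SURFACE_TO_CONCEPT`, `link`, and the `dbr:` identifier are illustrative names for this note, not the output of any specific tool.

```python
from collections import Counter

# Hypothetical surface-form -> concept table standing in for a real linker.
# The identifier below is illustrative only.
SURFACE_TO_CONCEPT = {
    "colorectal cancer": "dbr:Colorectal_cancer",
    "crc": "dbr:Colorectal_cancer",
    "colon cancer": "dbr:Colorectal_cancer",
}


def link(mention: str) -> str:
    """Return the linked concept, or the normalized mention if nothing matches."""
    key = mention.strip().lower()
    return SURFACE_TO_CONCEPT.get(key, key)


mentions = ["Colorectal cancer", "CRC", "colon cancer", "fatigue", "CRC"]

# Codes before linking: one per distinct (normalized) surface form.
raw_codes = Counter(m.strip().lower() for m in mentions)

# Codes after linking: synonyms share a single concept.
linked_codes = Counter(link(m) for m in mentions)

print(len(raw_codes), "distinct codes before linking")
print(len(linked_codes), "distinct codes after linking")
```

On this toy input the four distinct surface forms become two codes, and unlinked mentions such as "fatigue" simply pass through unchanged.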
In a study of 133 market research projects:
- 39% of code candidates linked to resources with >0.9 confidence
- 16% had full resource, type, and hypernym links
- Ratio of matched candidates to hypernyms was ~6:1
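The three figures above could be computed along these lines. This is a sketch under assumptions: the `LinkResult` record, the `summarize` helper, and the exact definitions (for example, counting the ratio over fully linked candidates only) are guesses for illustration, not the study's method, and the data is a toy sample.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LinkResult:
    """Hypothetical record of one code candidate after linking."""
    candidate: str
    resource: Optional[str] = None
    confidence: float = 0.0
    type_: Optional[str] = None
    hypernym: Optional[str] = None


def summarize(results, threshold=0.9):
    """Share linked above threshold, share fully linked, candidates per hypernym."""
    if not results:
        raise ValueError("no candidates to summarize")
    linked = [r for r in results if r.resource and r.confidence > threshold]
    # "Full" link: resource plus type plus hypernym (assumed definition).
    full = [r for r in linked if r.type_ and r.hypernym]
    hypernyms = {r.hypernym for r in full}
    return {
        "linked_share": len(linked) / len(results),
        "full_share": len(full) / len(results),
        "candidates_per_hypernym": len(full) / len(hypernyms) if hypernyms else 0.0,
    }


# Toy sample, not data from the study.
results = [
    LinkResult("colorectal cancer", "dbr:Colorectal_cancer", 0.97, "Disease", "Cancer"),
    LinkResult("crc", "dbr:Colorectal_cancer", 0.93, "Disease", "Cancer"),
    LinkResult("colon cancer", "dbr:Colorectal_cancer", 0.95),  # no type or hypernym
    LinkResult("fatigue", "dbr:Fatigue", 0.60),                 # below threshold
    LinkResult("price"),                                        # no link at all
]

stats = summarize(results)
print(stats)
```

Keeping the confidence threshold as a parameter makes it easy to see how the link rate and the collapse ratio trade off as the threshold moves.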
This means entity linking isn’t just about adding context. It’s about reducing the vocabulary you need to reason over. Fewer distinct codes means more tractable analysis.
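The practical implication: even imperfect entity linking provides value. At a 39% link rate most candidates stay unlinked, yet the ones that do link collapse into far fewer concepts (roughly six matched candidates per hypernym in the study above).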
Related: 06-atom—entity-linking, 06-molecule—knowledge-graph-construction