Textual Similarity vs Conceptual Similarity in Taxonomy Building
When merging overlapping classification schemes, two distinct similarity types require different handling:
Textual similarity: Terms that have identical definitions, have synonymous names, or point to the same instances. These can be merged through string matching and definition comparison.
Conceptual similarity: Terms that differ in terminology but describe essentially the same phenomenon. Merging these requires human judgment: reading descriptions, examining examples, and comparing classification context. Two annotators assess each candidate pair independently, and a third resolves disagreements.
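The two-annotator review with a third resolving disagreements can be sketched as below. This is a minimal illustration, not a prescribed procedure: the function name adjudicate, the boolean "merge / do not merge" judgments, and the tiebreaker callback are all assumptions introduced for the example.

```python
from typing import Callable, Dict, Iterable, Tuple

# A candidate pair of terms drawn from two classification schemes.
Pair = Tuple[str, str]


def adjudicate(
    pairs: Iterable[Pair],
    annotator_a: Dict[Pair, bool],
    annotator_b: Dict[Pair, bool],
    tiebreaker: Callable[[Pair], bool],
) -> Dict[Pair, bool]:
    """Return a merge decision for every candidate pair.

    annotator_a and annotator_b hold independent judgments
    (True = same concept, merge). The hypothetical tiebreaker
    callback stands in for the third annotator and is consulted
    only when the first two disagree.
    """
    decisions: Dict[Pair, bool] = {}
    for pair in pairs:
        a, b = annotator_a[pair], annotator_b[pair]
        decisions[pair] = a if a == b else tiebreaker(pair)
    return decisions
```

Keeping the third annotator behind a callback means that judgment is only requested for contested pairs, which is where the extra reading effort is needed.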
The distinction matters because automated deduplication catches textual overlap but misses conceptual redundancy. A comprehensive taxonomy therefore needs both a mechanical pass and expert synthesis.
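The automated pass for textual overlap might look like the following sketch. The record layout (dicts with "name" and "definition" keys), the normalization rules, and the 0.9 threshold are assumptions chosen for illustration. Pure string matching also will not catch synonymous names on its own; that would need a synonym table, which is omitted here.

```python
import re
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def textually_similar(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Flag two terms as textual duplicates.

    Terms match if their names are identical after normalization,
    or if their normalized definitions are near-identical according
    to difflib's similarity ratio. The threshold is an assumed value
    and would need tuning on real data.
    """
    if normalize(a["name"]) == normalize(b["name"]):
        return True
    ratio = SequenceMatcher(
        None, normalize(a["definition"]), normalize(b["definition"])
    ).ratio()
    return ratio >= threshold
```

Pairs that this check rejects are not necessarily distinct concepts; they are the candidates left over for the human review described above.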