The ShareAlike Derivative Work Ambiguity

ShareAlike licenses (CC BY-SA, ODbL) require that derivative works be distributed under the same license. In software, “derivative work” has a reasonably established legal interpretation. For data, the boundary is murky.

Unanswered questions for enterprise data use:

  • Does joining ShareAlike data with proprietary data create a derivative?
  • Does internal enrichment without external distribution trigger obligations?
  • Does querying a database and using results constitute derivation?
  • If you train a model on ShareAlike data, is the model a derivative?

Legal theory hasn’t caught up with common data practices, and different lawyers give different answers to the same question.

Practical implication: ShareAlike-licensed datasets (DBpedia, YAGO, ConceptNet) may be valuable, but shouldn’t be deployed without explicit legal guidance. Internal-only use might not trigger ShareAlike, but “might” isn’t good enough for compliance.

The safe path: Prioritize public domain and attribution-only datasets. Use ShareAlike datasets only after formal legal assessment of your specific use case.
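
One way to make this rule enforceable is a small gate in the data-ingestion path. The sketch below is illustrative only: the names `Tier`, `LICENSE_TIERS`, `Dataset`, `legal_review_ref`, and `may_use` are hypothetical, the license identifiers follow SPDX naming, and the mapping is an assumption to be replaced by your own license tier framework. It approves public domain and attribution-only datasets, blocks unknown licenses, and approves ShareAlike datasets only when a legal review reference is recorded.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    PUBLIC_DOMAIN = "public-domain"
    ATTRIBUTION = "attribution-only"
    SHARE_ALIKE = "share-alike"


# Hypothetical mapping from SPDX-style license IDs to tiers.
# Assumption: extend and verify this against each dataset's actual license text.
LICENSE_TIERS = {
    "CC0-1.0": Tier.PUBLIC_DOMAIN,
    "PDDL-1.0": Tier.PUBLIC_DOMAIN,
    "CC-BY-4.0": Tier.ATTRIBUTION,
    "CC-BY-SA-4.0": Tier.SHARE_ALIKE,
    "ODbL-1.0": Tier.SHARE_ALIKE,
}


@dataclass(frozen=True)
class Dataset:
    name: str
    license_id: str
    # Hypothetical field: ticket or memo ID of a formal legal assessment
    # covering this specific use case; None means no review on record.
    legal_review_ref: Optional[str] = None


def may_use(dataset: Dataset) -> bool:
    """Return True if the dataset passes the license gate."""
    tier = LICENSE_TIERS.get(dataset.license_id)
    if tier is None:
        # Unknown license: block until someone classifies it.
        return False
    if tier is Tier.SHARE_ALIKE:
        # ShareAlike: allowed only with a recorded legal assessment.
        return dataset.legal_review_ref is not None
    # Public domain and attribution-only pass the gate.
    return True
```

The gate only records that a review exists; it does not decide any of the open legal questions above. A ShareAlike dataset with no `legal_review_ref` is rejected even for internal-only use, which matches the note's position that “might” isn’t good enough for compliance.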

Related: 04-atom—license-tier-framework, 04-atom—data-governance