Highlight Extraction vs. Text Summarization

These are different problems that get conflated.

Text summarization distills a large body of text into an abridged version, an abstract. The goal is compression with coverage.

Highlight extraction finds the most interesting fragments, as judged by human opinion. The goal is identification of moments that matter, not comprehensive coverage.

The closest analogy in text analysis is extraction of quotes from a novel. Quotes don’t summarize a book or classify it. They give the reader a vignette of the most important fragments (moments worth preserving).

This distinction matters because the techniques that work for summarization (finding representative sentences, maximizing coverage) don’t necessarily work for highlight detection. Highlights may be interesting precisely because they’re not representative.
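A toy sketch may make the mismatch concrete. It scores each sentence by how close it sits to the document's overall word distribution, a simple stand-in for the "representative sentence" idea behind extractive summarization. The function names, the bag-of-words scoring, and the sample sentences are all assumptions made up for illustration. Low representativeness is not a highlight detector either; the point is only that a coverage-oriented score can rank a memorable moment last.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def representativeness(sentences):
    """Score each sentence by similarity to the document centroid.

    Hypothetical helper: the centroid is just the summed word counts
    of all sentences, which is a crude proxy for "what the text is about".
    """
    vectors = [Counter(tokenize(s)) for s in sentences]
    centroid = Counter()
    for vector in vectors:
        centroid.update(vector)
    return [cosine(vector, centroid) for vector in vectors]


# Made-up example text: three on-topic sentences and one odd moment.
sentences = [
    "The team met on Monday to review the quarterly budget.",
    "The budget review covered the team travel costs for the quarter.",
    "The team agreed the budget review would continue next Monday.",
    "Then the fire alarm went off and everyone ran outside laughing.",
]

scores = representativeness(sentences)
most_representative = max(range(len(sentences)), key=scores.__getitem__)
least_representative = min(range(len(sentences)), key=scores.__getitem__)

print("Summary-style pick:", sentences[most_representative])
print("Lowest coverage score:", sentences[least_representative])
```

In this sample the fire-alarm sentence gets the lowest score, so a summarizer built on this measure would drop it first, even though a human reader might well call it the moment worth keeping.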

Related: [None yet]