The Compression Theory of Understanding

Understanding is compression. When you understand something, you can generate the specific from the general — predict the next case from the pattern, derive the detail from the principle. When you don't understand it, you can only recite what you've been told.

This is why explaining something is a test of understanding. Explanation forces you to generate — to produce the thing from your model, not just retrieve it from memory. If your explanation breaks down at a specific question, that's where your compression fails. The failure location tells you exactly what you don't understand.

The implication for learning: memorization and understanding are not on the same axis. You can memorize a lot without understanding anything. Understanding requires building a generative model — something that can produce outputs you haven't seen before. Memorization produces a lookup table. Understanding produces a function.
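A minimal sketch of that distinction in Python (the names `memorized` and `understood` are illustrative, not from any source): the lookup table can only answer cases it has stored, while the function generates cases it has never seen.

```python
# Memorization: a lookup table of observed input -> output pairs.
memorized = {0: 0, 1: 1, 2: 4, 3: 9}

# Understanding: the compressed rule behind those pairs.
def understood(n: int) -> int:
    return n * n

print(memorized.get(10))  # None: nothing stored, recall fails on unseen input
print(understood(10))     # 100: the rule generalizes past its examples
```

The table grows with every case it covers; the function stays the same size and covers them all. That gap is the compression.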

Compression quality as an epistemic metric

A better understanding produces a smaller description of the same domain. Newton's laws compress a huge range of mechanical phenomena into three statements. Darwin's insight about variation and selection compresses an enormous diversity of biological observations into one mechanism. The compression ratio is a rough proxy for explanatory power.

This means understanding is measurable, at least in principle. The question "how well do you understand X?" can be operationalized as "how compactly can you represent X, while still being able to derive arbitrary specific instances?" A domain expert's representation is compact and generative. A novice's representation is verbose and brittle.
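One crude way to make this concrete (a sketch, not a real metric: zlib stands in for description length, since true Kolmogorov complexity is uncomputable and any actual compressor only gives an upper bound) is to compress rule-generated data against structureless data of the same size:

```python
import random
import zlib

N = 10_000

# Data generated by a short rule: squares reduced mod 251, so the
# byte sequence repeats with period 251.
rule_based = bytes((i * i) % 251 for i in range(N))

# Data with no structure for the compressor to exploit.
random.seed(0)
structureless = bytes(random.randrange(256) for _ in range(N))

for name, data in [("rule-based", rule_based), ("structureless", structureless)]:
    packed = zlib.compress(data, 9)
    print(f"{name}: {len(data)} -> {len(packed)} bytes, "
          f"ratio {len(data) / len(packed):.1f}x")
```

The compressor finds the regularity the rule induces and collapses it; the random bytes barely shrink at all. The expert/novice contrast is the same phenomenon: a representation built on the generating rule is short, and one built on case-by-case recall is not.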

Where the theory breaks

The compression model of understanding works well for rule-governed domains: physics, mathematics, formal systems. It is weaker where the structure is contested or context-dependence is extreme, because there is no single stable pattern for a compact description to capture.

Knowing when to apply which rule is often the hard part, and that meta-knowledge doesn't compress neatly. An expert doctor's knowledge can't be fully expressed as a decision tree — some of what they know is tacit, pattern-based, resistant to explicit formulation. Compression captures the explicit structure; it misses the embodied part.

The useful version of this theory: compression is a necessary but not sufficient condition for understanding. You can't understand without a generative model. But having a generative model doesn't mean you have the full picture.

This framing derives from work in algorithmic information theory (Kolmogorov, Solomonoff) and from predictive-processing frameworks in cognitive science.