The Human-AI Boundary

There is a common mistake in thinking about what AI can and cannot do: treating the boundary as fixed. "AI can do X but not Y," stated in the present tense as if it were a permanent law of nature rather than a description of current capability at a specific moment.

The boundary between what humans do better and what AI does better is moving, and the movement is not symmetric or predictable across domains.

How the boundary moves

In most domains, AI capability is improving faster than human capability. The relevant question is not "can AI do this?" but "how long until AI does this better than most humans doing it professionally?"

The domains where AI capability has improved fastest share a structure: clear success criteria, large existing datasets, high iteration speed. Translation, image classification, game-playing, code completion — all of these fit the pattern. The domains where AI remains limited are typically those without clear success criteria (novel research), extreme context-dependence (complex negotiation), or physical embodiment requirements (fine motor tasks).

The dangerous middle zone

The most important part of the boundary is not the clear cases — it's the zone where AI performance is good enough to be used but not good enough to be trusted without supervision. In this zone, human oversight is required, but the oversight is hard to do well precisely because the AI is good enough to produce plausible-sounding outputs.

This is the current state of AI in medicine, law, and financial advice. Outputs that look authoritative, that a non-expert cannot easily evaluate, that require domain expertise to audit — and that are good enough often enough that skipping the audit is tempting.

The risk is not that AI fails in obvious ways. It's that it fails in subtle ways that survive human review: review effort concentrates on the cases that look wrong, so the cases that look right but aren't pass unchecked.
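The failure mode above can be made concrete with a toy simulation. All the numbers here are hypothetical assumptions for illustration (error rates and "looks wrong" probabilities are invented, not measured); the point is only that an audit-what-looks-wrong policy passes exactly the errors that look plausible.

```python
import random

random.seed(0)

# Hypothetical parameters, chosen for illustration only.
N = 100_000                     # AI outputs produced
P_ERROR = 0.05                  # fraction of outputs that are actually wrong
P_FLAG_IF_WRONG = 0.6           # obvious errors: flagged by a quick glance
P_FLAG_IF_RIGHT = 0.02          # correct outputs that look suspicious anyway

wrong_total = 0                 # outputs that are actually wrong
escaped = 0                     # wrong outputs that pass review

for _ in range(N):
    wrong = random.random() < P_ERROR
    looks_wrong = random.random() < (P_FLAG_IF_WRONG if wrong
                                     else P_FLAG_IF_RIGHT)
    # Review policy: audit only the outputs that look wrong.
    if wrong:
        wrong_total += 1
        if not looks_wrong:     # subtle error: plausible, so never audited
            escaped += 1

print(f"subtle errors passing review: {escaped / wrong_total:.0%}")
```

Under these assumed rates, roughly the share of errors that look plausible (here, about 40%) is never reviewed at all, no matter how diligent the reviewers are on the flagged cases.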

What this means for how we work

The implication is not "don't use AI in high-stakes domains." It's "invest in evaluation infrastructure proportional to the stakes." The bottleneck in a world with capable AI is not generation — it's verification. The humans who remain valuable are those who can tell good outputs from bad ones, reliably, faster than alternatives.
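"Evaluation proportional to the stakes" can be sketched as a sampling policy: audit enough of the output stream that expected undetected harm stays under a budget. This is a minimal illustration under stated assumptions; the formula, the harm budget, and the stakes values are all hypothetical, not a calibrated method.

```python
def audit_rate(stakes: float, error_rate: float, harm_budget: float) -> float:
    """Fraction of outputs to audit so expected un-audited harm per
    output stays within harm_budget.

    Assumption (illustrative): expected harm per un-audited output is
    error_rate * stakes, so we need
        (1 - rate) * error_rate * stakes <= harm_budget.
    """
    if error_rate * stakes <= harm_budget:
        return 0.0  # low stakes: even zero auditing stays within budget
    return min(1.0, 1.0 - harm_budget / (error_rate * stakes))

# Example: 2% residual error rate, harm budget of 1 "unit" per output.
for stakes in (10, 100, 1_000, 10_000):
    print(stakes, audit_rate(stakes, error_rate=0.02, harm_budget=1.0))
```

The shape of the result matches the argument: the audit fraction climbs toward 100% as stakes grow, which is exactly where verification, not generation, becomes the bottleneck.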

See also: benchmark-inversion.html, evaluation-infrastructure.html