The 19% Delusion
Experienced developers using AI tools took 19% longer to complete tasks. They estimated they were 20% faster.
That's not a rounding error. That's a 39-point gap between perception and reality.
The METR study dropped this in 2025: a randomized controlled trial, actual time measurements, developers with years under their belt. The finding should make anyone using AI tools uncomfortable—not because AI failed, but because the developers couldn't tell it had.
Meanwhile, Salesforce's 2024 developer productivity research measured 30% gains on routine coding tasks—standardized work like generating boilerplate, writing test scaffolds, and translating between data formats.
Both studies are valid. Both tell the truth. The difference is task type.
Two kinds of fast
Working with AI feels productive. You're generating more output. You're moving faster through drafts. The tool responds instantly. Every interaction creates a small hit of progress.
But progress isn't productivity. Activity isn't output. And the feeling of speed isn't speed.
Coherentism teaches us to seek resonance—that internal signal when pieces fit, when a belief earns its place by cohering with everything else we hold true. It's how we navigate without complete information. But there's a failure mode: mistaking hedonic signal for epistemic fit. Something can feel right without being true.
The experienced developers in the METR study weren't fools. They were experts—which may have made the problem worse. They had strong priors about their own capability. When AI made the work feel easier, they mapped that ease onto accuracy: "I'm fast. This feels fast. Therefore I'm being fast."
The feeling of productivity cohered with their self-model. But it didn't cohere with the clock.
What actually works
The Salesforce study points to where AI delivers: routine tasks. The repetitive work. The boilerplate. The parts where human creativity isn't the bottleneck—where the bottleneck is just typing speed and lookup time.
Here's the pattern:
AI accelerates execution. It doesn't accelerate judgment.
When you need to scaffold CRUD endpoints, generate test stubs, or translate between formats—tasks with clear inputs, predictable outputs, and low ambiguity—AI compresses time. Thirty percent faster. Maybe more.
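To make "routine" concrete, here's the kind of task that fits: translating a CSV file into JSON. A minimal sketch in Python (the command-line wiring is illustrative, not drawn from either study):

```python
import csv
import json
import sys

def csv_to_json(csv_path: str) -> str:
    """Translate a CSV file into a JSON array of row objects."""
    with open(csv_path, newline="") as f:
        rows = list(csv.DictReader(f))  # header row becomes the keys
    return json.dumps(rows, indent=2)

if __name__ == "__main__":
    # Usage: python csv_to_json.py data.csv
    print(csv_to_json(sys.argv[1]))
```

Clear input, predictable output, low ambiguity. Nothing here requires knowing your system.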
When you need to diagnose a subtle bug in a codebase you know intimately, navigate tradeoffs that require system-level thinking, or make architectural decisions—tasks where expertise is the tool—AI doesn't just fail to help. It actively costs you time. You're explaining context to a system that can't hold it, reviewing suggestions that miss the point, spending cognitive cycles managing a collaboration that isn't actually collaborating. The experts didn't break even. They lost 19%.
They were working on tasks they knew well. Their expertise was the asset. AI diluted it.
Three questions to cut through
Before you assume AI is helping:
1. Can I describe the task without referencing my specific knowledge?
"Write a function that validates email format" → AI territory. "Figure out why authentication breaks on the third retry in our system" → Your territory.
2. Would a junior developer's output be acceptable here?
AI gives you junior-developer work at high speed. Sometimes that's exactly what you need. Sometimes it's an expensive distraction.
3. Am I reviewing, or am I rewriting?
If you're rewriting more than 30% of AI output, you're paying twice: once to generate, once to fix. Track this. Actually track it. A rough way to measure it follows this list.
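One cheap way to track it: save the AI's raw suggestion before you touch it, then diff it against what you actually ship. A minimal sketch using Python's standard difflib (the file names are placeholders):

```python
import difflib

def rewrite_ratio(ai_output: str, final_version: str) -> float:
    """Fraction of the AI's lines that did not survive into the final version."""
    ai_lines = ai_output.splitlines()
    if not ai_lines:
        return 0.0
    final_lines = final_version.splitlines()
    matcher = difflib.SequenceMatcher(None, ai_lines, final_lines)
    kept = sum(block.size for block in matcher.get_matching_blocks())
    return 1.0 - kept / len(ai_lines)

# Placeholders: the AI's untouched suggestion vs. the version you committed.
with open("ai_suggestion.py") as ai, open("committed.py") as final:
    ratio = rewrite_ratio(ai.read(), final.read())
print(f"rewritten: {ratio:.0%}")  # paying twice if this tops 30%
```

Line-level diffing is crude, but it's enough to tell 10% cleanup from a 50% rewrite.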
Measure or keep guessing
The developers in the study didn't measure. They estimated. Estimation, when you're inside an experience that feels good, is guessing with extra confidence.
For one week, try this:
Pick five tasks. Before starting, estimate how long they'll take with AI. Time yourself. Compare.
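The log doesn't need tooling. A minimal sketch, assuming a stopwatch loop and a CSV file (the file name and task label are placeholders):

```python
import csv
import time
from datetime import date

LOG = "ai_calibration.csv"  # placeholder file name

def timed_task(name: str, estimate_min: float) -> None:
    """Record your estimate, time the task, and log both."""
    start = time.monotonic()
    input(f"{name} -- press Enter when done... ")
    actual_min = (time.monotonic() - start) / 60
    with open(LOG, "a", newline="") as f:
        csv.writer(f).writerow([date.today(), name, estimate_min, round(actual_min, 1)])
    print(f"estimated {estimate_min:g} min, actual {actual_min:.1f} min")

# Placeholder task name; run once per task for a week.
timed_task("scaffold CRUD endpoints", estimate_min=20)
```

Five rows of this beats any amount of estimating.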
Not to prove AI doesn't work—it does work, in the right context. But to calibrate your sense of where it works for you, on your tasks, in your codebase.
The goal isn't to stop using AI. The goal is to stop trusting your intuition about AI until you've tested it.
Seek resonance. But verify it.