Build Three, Pick One
When generation is nearly free, planning is no longer cheaper than building. The smart move: write the tests, build three implementations, and let measurement replace speculation.
You're specifying requirements for a feature you've never built. Edge cases you've imagined but not encountered. Architecture diagrams for problems you haven't touched yet. You're planning because building is expensive and you don't want to waste the investment.
That math used to work. It doesn't anymore.
When generating code costs nearly nothing, when a prompt produces a working implementation in minutes, the entire cost structure that justified upfront planning collapses. The question isn't whether you can plan before building. It's whether you should, when building is now the faster way to learn.
The Sequence That Flipped
The traditional workflow has a clear logic: specify, design, build, test. You invest heavily in the early phases because mistakes discovered late are expensive to fix. Get the requirements right. Nail the architecture. Then build. This sequence optimizes for a world where construction is the bottleneck.
Luca Palmieri describes a different practice emerging among teams using agentic coding tools: prototype to discover constraints.[^1] Instead of specifying upfront what the system needs to handle, you build an implementation and observe where it breaks. The constraints reveal themselves through contact with reality, not through prediction.
The companion practice is what Palmieri calls build to compare: generate multiple implementations of the same feature, then evaluate them against each other. Not A/B testing on users. A/B testing on the problem itself. Three implementations, three different approaches, three sets of tradeoffs made visible by running code rather than drawing boxes on whiteboards.
This isn't sloppiness. It's empiricism. You're not skipping the design phase; you're moving it after the build phase, where it can operate on evidence instead of assumption.
The Instrument Comes First
But here's where most people get this wrong.
Build three implementations without a way to evaluate them, and you have three piles of code and no basis for choosing. The comparison is subjective. The "best" one is whichever feels cleanest on a skim. You've replaced planning-without-evidence with building-without-measurement. You've traded one kind of speculation for another.
Simon Willison identifies the precondition that makes this practice work: first, write the tests.[^2] Before you generate a single line of implementation, write the test suite. The tests are the measurement instrument. They define what "works" means, not in abstract requirements language, but in executable assertions that any implementation can be run against.
With tests in place, "build three, pick one" becomes a scientific method:
- Define the hypothesis: write tests that encode what success looks like
- Run the experiments: generate three implementations
- Measure the results: run each against the test suite
- Select on evidence: pick the one that passes, performs, and reads cleanly
Without the test suite, you're building prototypes. With it, you're running experiments. The difference is the instrument.
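As a concrete sketch of the instrument (the `slugify` feature, its cases, and the `run_suite` helper are all hypothetical, Python assumed), the suite can exist before any implementation does and score every candidate the same way:

```python
# A measurement instrument written before any implementation exists.
# The feature (slugify) and its cases are hypothetical; the point is
# that every candidate is judged by the same executable spec.

CASES = [
    ("Hello, World!", "hello-world"),
    ("  spaces   everywhere  ", "spaces-everywhere"),
    ("already-a-slug", "already-a-slug"),
    ("", ""),
]

def run_suite(impl):
    """Run one candidate against the shared cases.

    Returns (passed_count, failures), where failures lists the
    inputs that broke and why.
    """
    failures = []
    for raw, expected in CASES:
        try:
            got = impl(raw)
        except Exception as exc:
            failures.append((raw, f"raised {exc!r}"))
            continue
        if got != expected:
            failures.append((raw, f"got {got!r}, expected {expected!r}"))
    return len(CASES) - len(failures), failures
```

Each generated implementation gets passed to `run_suite`; the scores, not a skim of the code, decide the winner.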
The Compost Beneath the Winner
Two of those three implementations get discarded. In a planning-first world, that's failure: you built things you didn't ship. Resources spent, nothing to show.
But discarded prototypes aren't waste. They're data.
Implementation A revealed that the naive approach hits a database lock under concurrent writes. You wouldn't have predicted that from a requirements doc. Implementation B showed that the elegant recursive solution blows the stack at the scale you actually need. No amount of architecture review would have surfaced that; it only appears when code meets data. Implementation C works. You ship it.
But you didn't just build a feature. You built a map of the problem space. You now know where the locks are, where the stack limits bite, and which approach survives contact with reality. The "failed" prototypes composted into knowledge that makes every future decision in this area more informed.
The prototype that broke taught you something the spec never could. The failure wasn't a detour; it was the shortest path to understanding the constraint. You couldn't have planned around a problem you didn't know existed. You had to build into it.
And that knowledge compounds. The next time you face a similar problem, you don't start from speculation. You start from the empirical record of what broke and why. The compost feeds the next cycle.
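The stack-limit constraint that Implementation B hits is a good example of something that surfaces only at runtime. A toy sketch (a hypothetical linked-list depth feature, Python assumed) of why the comparison has to be empirical:

```python
import sys

# Two implementations of the same hypothetical contract: the depth of a
# linked list. The recursive one reads beautifully and passes every small
# test; the constraint only appears when code meets data at real scale.

def depth_recursive(node):
    # Elegant, but every element costs a stack frame.
    if node is None:
        return 0
    return 1 + depth_recursive(node["next"])

def depth_iterative(node):
    # Same contract, constant stack depth.
    count = 0
    while node is not None:
        count += 1
        node = node["next"]
    return count

def chain(n):
    # Build a linked list of n nodes, iteratively.
    head = None
    for _ in range(n):
        head = {"next": head}
    return head

# At a scale just past the interpreter's recursion limit, the recursive
# version fails in a way no architecture review would have predicted.
deep = chain(sys.getrecursionlimit() + 100)
try:
    depth_recursive(deep)
except RecursionError:
    pass  # the constraint, discovered empirically

assert depth_iterative(deep) == sys.getrecursionlimit() + 100
```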
Where Planning Still Has Leverage
This doesn't mean planning is dead. It means planning moves to where it earns its keep. You plan the tests, because those encode what you're trying to achieve. You plan the evaluation criteria, because those define what "better" means. You stop planning the implementation, because the implementation is now cheap enough to discover empirically.
Building three and picking one isn't a brute-force strategy. It's a way of listening to the problem with your hands.
Plan what to measure. Build to discover what to do.
The Practical Move
Tomorrow, when you face a feature with uncertain requirements:
- Write the test suite first. Define what success looks like in executable assertions. This is your measurement instrument; without it, comparison is opinion.
- Prompt for three implementations. Different approaches, different architectures, different tradeoffs. Let the model explore the solution space.
- Run all three against the tests. Observe which passes, which fails, and how each fails.
- Pick the winner. Compost the rest. The failed implementations are data about your problem space.
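Steps 3 and 4 can be folded into a small selection harness. A sketch (the `evaluate` helper and the candidate names are hypothetical):

```python
# Hypothetical "pick the winner" harness: every candidate runs against the
# same cases, and the losers' failure records are kept as data, not lost.

def evaluate(candidates, cases):
    """candidates: {name: callable}; cases: list of (input, expected) pairs.

    Returns (name, passes, failed_inputs) tuples, best score first.
    """
    results = []
    for name, impl in candidates.items():
        failed = []
        for arg, expected in cases:
            try:
                if impl(arg) != expected:
                    failed.append(arg)
            except Exception:
                # A crash is a data point too: record it and move on.
                failed.append(arg)
        results.append((name, len(cases) - len(failed), failed))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

The first entry ships; the `failed` lists on the rest are the composted map of the problem space.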
As Willison puts it: "Any time our instinct says don't build that, it's not worth the time – fire off a prompt anyway."[^2] The cost of building has fallen below the cost of deciding whether to build.
Stop planning what to build. Build to discover what to plan.
[^1]: Luca Palmieri, "Can agentic coding raise the quality bar?" Describes the practices of "Prototype to Discover Constraints" and "Build to Compare" as emerging patterns in agentic coding workflows.
[^2]: Simon Willison, "Writing code is cheap now" – "Any time our instinct says don't build that, it's not worth the time – fire off a prompt anyway." Also identifies writing tests first as the precondition for productive prototyping.
Sources: Luca Palmieri, "Can agentic coding raise the quality bar?"; Simon Willison, "Writing code is cheap now" (February 2026)