A recent piece by Andrea Laforgia on Expectation-Driven Development (EDD) made the rounds, and it deserves serious attention. The core argument is compelling: AI agents produce code faster than humans can meaningfully review it, so we need a structured protocol for specifying intent before implementation and demanding evidence of fulfillment afterward. The human developer transitions from author to editor — from writing code to evaluating it.

That framing is right. And the EDD workflow — write expectations in plain text, let the agent implement, ask the agent to prove it, challenge the evidence, iterate — is a real improvement over the current default, which is roughly “trust and hope the CI is green.”

But EDD solves a specific problem: the gap between human intention and AI implementation. It does not solve the problem that comes next.

To make this concrete, picture a developer asking an agent to fix how discount codes are applied at checkout. The expectation is precise: discounts apply to the pre-tax subtotal, tax is calculated after, an empty cart returns zero rather than an error. The agent implements it, runs the test suite, and produces evidence — three scenarios with real numbers, matching exactly what was specified. The developer reviews the evidence adversarially, pushes back once on a stacked-discount edge case, gets a revised version, and is satisfied. This is EDD working exactly as intended.
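The three expectations in this example can be written down as executable checks. What follows is a minimal Python sketch, not the code from the scenario: the function name `checkout_total`, its parameters, the fixed-amount discount, and the 8% tax rate are all assumptions chosen for illustration.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")

def checkout_total(prices, discount_amount, tax_rate):
    """Hypothetical checkout calculation used only to illustrate the expectations.

    The discount is taken off the pre-tax subtotal, tax is calculated on the
    discounted amount, and an empty cart returns zero instead of raising.
    """
    if not prices:
        return Decimal("0.00")
    subtotal = sum(prices, Decimal("0"))
    # Never let a discount push the taxable amount below zero.
    discounted = max(subtotal - discount_amount, Decimal("0"))
    tax = (discounted * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    return (discounted + tax).quantize(CENT, rounding=ROUND_HALF_UP)

# Assumed scenario: a 100.00 cart, a 10.00 discount code, 8% tax.
cart = [Decimal("60.00"), Decimal("40.00")]
total = checkout_total(cart, Decimal("10.00"), Decimal("0.08"))
empty = checkout_total([], Decimal("10.00"), Decimal("0.08"))
```

With these assumed numbers the order of operations is observable: discounting first gives 90.00 plus 7.20 tax, while taxing first would give 108.00 minus 10.00. A fixed-amount discount is used because a percentage discount yields the same total in either order.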

Where EDD Stops

EDD ends when the developer is satisfied with the evidence. The expectation has been met. The code works. The diff is ready.

What happens after that?

In most teams, the answer is: it gets merged. Maybe a colleague glances at the diff. Maybe not. The CI is green, the expectations were verified (at least in the agent’s own estimation), and the code lands in the main branch.

In our example, the discount fix touched a shared PricingEngine interface — the same one the inventory team’s reservation logic depends on. Nobody chose to ignore that. It simply wasn’t part of the expectation. The expectation was about discounts and tax, not about who else reads from that interface. Three weeks later, a reservation bug surfaces that takes two days to trace back to this merge.

This is precisely where a different problem begins — one I have called the last mile problem: the distance between finished code and trusted repository.

EDD is, at its core, an awareness tool. It makes the developer better informed about whether the code fulfills its stated intent. But awareness tools have a structural limitation: their effectiveness depends entirely on whether someone acts on what they now know. A very thorough EDD process can still produce a merge that silently violates an architectural boundary — not because the code is wrong, but because the expectations never captured the right constraints.

Nobody wrote an expectation that said: “This change must not modify a shared interface that three other teams depend on unless those teams know about it.” That kind of constraint does not live in the feature spec. It lives in the structure of the codebase.

Think of the difference as a receptionist versus a turnstile. A receptionist notices if someone heading into a restricted area looks like they don’t belong, and says something. Whether that visitor stops depends entirely on whether they care to listen. A turnstile does not notice anything — it simply does not open without the right badge. EDD, however thorough, is a receptionist. It can flag, advise, and warn. It cannot stop a determined merge.

Two Different Problems, Two Different Tools

It is worth being precise about the distinction here, because conflating these two problems leads to architectures that feel complete but have a hidden gap.

Specification problems ask: Does the code do what I intended? EDD addresses this. It forces developers to articulate intent before implementation and to demand evidence that the intent was fulfilled. This is valuable, and it was genuinely hard to do well before AI agents made the evidence-gathering step tractable.

Coordination problems ask: Does this change affect something that belongs to someone else? No amount of expectation-writing resolves this, because the affected parties are not in the room. The constraint is not derivable from the feature spec alone. It requires knowledge of the codebase’s actual coupling structure — which files have historically changed together, which teams own which components, where the real boundaries are.

Back to the checkout example: the developer writing the discount expectation has never spoken to anyone on the inventory team. There is no reason they would think to. The two features look unrelated from where they sit. The relationship only becomes visible from the outside — from the pattern of how these files have actually been touched over time.

EDD is designed for the first problem. It is not designed for the second. Applying it to the second produces a feeling of rigor without the substance.

What the Repository Already Knows

The good news is that coordination problems leave traces.

When two components are genuinely coupled — when changing one reliably requires changing the other — that pattern shows up in the commit history. Files that have been modified together repeatedly across time exhibit change coupling: a data signal derived not from someone’s opinion about the architecture, but from the actual history of how the codebase evolved.
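The raw material for this signal is simply the set of files each commit touched. As a rough sketch, assuming the history has been exported with something like `git log --name-only --pretty=format:__commit__%H`, the output can be grouped per commit. The `__commit__` marker is an arbitrary sentinel, and the function name `parse_commits` and the file paths are hypothetical.

```python
def parse_commits(log_text, marker="__commit__"):
    """Group changed file paths by commit.

    Assumes each commit begins with a line starting with `marker`
    (a sentinel chosen for this sketch), followed by one changed
    file path per line. Blank lines are ignored.
    """
    commits = []
    current = None
    for line in log_text.splitlines():
        line = line.strip()
        if line.startswith(marker):
            current = set()
            commits.append(current)
        elif line and current is not None:
            current.add(line)
    return commits

# Hypothetical log excerpt with two commits.
sample_log = """__commit__a1b2c3
pricing/engine.py
inventory/reservation.py

__commit__d4e5f6
pricing/engine.py
docs/README.md
"""
commits = parse_commits(sample_log)
```

Each element of `commits` is the set of files one commit modified, which is all the later analysis needs.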

A seismograph does not predict an earthquake by reasoning about plate tectonics from first principles. It records vibration, and the pattern of past vibration tells you something real about where the next one is likely to originate. Change coupling works the same way: it does not need to understand why PricingEngine and the inventory reservation logic are related. It only needs to notice that, across forty prior commits, they have moved together twenty-three times. That is enough to raise a flag worth taking seriously — in our checkout example, exactly the flag that would have caught the discount change before it shipped.
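The counting itself is small. The sketch below is one possible way to compute it, under the assumption that "forty prior commits" means forty commits that touched the pricing file. The file paths, the function names, and the ratio used as a coupling measure are illustrative choices, not a description of any particular tool.

```python
from collections import Counter
from itertools import combinations

def co_change_counts(commits):
    """Count how often each file, and each pair of files, appears in a commit."""
    pair_counts = Counter()
    file_counts = Counter()
    for files in commits:
        unique = sorted(set(files))
        file_counts.update(unique)
        # Pairs are emitted in sorted order, so (a, b) and (b, a) share one key.
        pair_counts.update(combinations(unique, 2))
    return pair_counts, file_counts

def coupling(pair_counts, file_counts, a, b):
    """Share of the commits touching `a` that also touched `b`."""
    if file_counts[a] == 0:
        return 0.0
    return pair_counts[tuple(sorted((a, b)))] / file_counts[a]

# Hypothetical paths standing in for PricingEngine and the reservation logic.
PRICING = "pricing/engine.py"
RESERVATION = "inventory/reservation.py"

# Synthetic history: 40 commits touch pricing, 23 of them also touch reservation.
history = [{PRICING, RESERVATION}] * 23 + [{PRICING, "checkout/cart.py"}] * 17
pairs, totals = co_change_counts(history)
ratio = coupling(pairs, totals, PRICING, RESERVATION)
```

With this synthetic history the ratio comes out at 23/40, a little over half. Where to set a threshold for raising a flag is a judgment call that this sketch leaves open.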

This matters because change coupling identifies coordination boundaries that static rules cannot. Not every interface change is equally consequential. Not every shared file is a real dependency. But when a proposed change touches files that have historically co-changed with files owned by a different team, the repository itself is telling you that this area has not historically been a local decision.
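Combining the two inputs, coupling and ownership, might look like the following sketch. It assumes a precomputed map of historically coupled files and a file-to-team ownership map. Both data structures, the team names, and the function name `cross_team_couplings` are hypothetical.

```python
def cross_team_couplings(changed_files, coupled_with, owners, author_team):
    """List (changed_file, coupled_file, owning_team) triples that cross teams.

    coupled_with: dict mapping a file to the files it has historically
                  co-changed with (assumed to be computed elsewhere)
    owners:       dict mapping a file to the team that owns it
    """
    hits = []
    for changed in sorted(changed_files):
        for partner in sorted(coupled_with.get(changed, ())):
            team = owners.get(partner)
            if team is not None and team != author_team:
                hits.append((changed, partner, team))
    return hits

# Hypothetical data for the checkout example.
coupled_with = {
    "pricing/engine.py": {"inventory/reservation.py", "checkout/discounts.py"},
}
owners = {
    "pricing/engine.py": "checkout",
    "checkout/discounts.py": "checkout",
    "inventory/reservation.py": "inventory",
}
changed = {"pricing/engine.py", "checkout/discounts.py"}
hits = cross_team_couplings(changed, coupled_with, owners, author_team="checkout")
```

Files coupled only to the author's own team produce no hit, so the check stays quiet for changes that have historically been local decisions.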

That distinction — between an advisory generated from a prompt and a trigger generated from data — is the difference between awareness and governance. One is an opinion about what might matter. The other is evidence about what has mattered.

The Two Halves of the Loop

Putting this together, the full workflow for AI-assisted development looks like a loop with two distinct halves:

First half (EDD): Specify intent → agent implements → agent proves → human challenges → iterate to convergence. This closes the gap between what the developer wanted and what the agent produced. It is a conversation between a human and an AI about a single feature.

Second half (Change Coupling + Governance): Before merge, check whether the change crosses an ownership boundary that the repository’s history suggests is real. If it does, trigger a coordination step — not as a suggestion, but as a requirement. This closes the gap between what the agent produced and what the repository can safely absorb.
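The difference between a suggestion and a requirement can be made explicit in code. In this minimal sketch, the merge is allowed only once every affected team has acknowledged the change. The names `GateDecision` and `merge_gate`, and the idea of tracking acknowledgements as a set of team names, are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GateDecision:
    allowed: bool
    pending_teams: tuple

def merge_gate(affected_teams, acknowledged_by):
    """Allow the merge only when every affected team has acknowledged it."""
    pending = tuple(sorted(set(affected_teams) - set(acknowledged_by)))
    return GateDecision(allowed=not pending, pending_teams=pending)

# The coupling check flagged the inventory team; nobody has responded yet.
blocked = merge_gate({"inventory"}, set())

# After the coordination step, the same change passes.
cleared = merge_gate({"inventory"}, {"inventory"})
```

In a CI pipeline, a check like this would typically fail the job while `allowed` is false, which is what turns the flag from advice into a gate.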

Neither half replaces the other. EDD without governance produces well-specified code that still merges silently across team boundaries. Governance without EDD produces gates that catch coordination problems but do nothing about specification problems. Together, they address the full distance from intent to repository.

Replayed with both halves in place, the checkout example ends differently. The discount logic is specified and verified exactly as before — EDD did its job well. But before merge, the coupling check sees that PricingEngine has co-changed with the inventory reservation files repeatedly in the past. The merge does not proceed silently. It routes to a short conversation with the inventory team — not because a rule said “interfaces require review,” but because the repository’s own history said this particular interface does.

The Deeper Point

Laforgia is honest about one of EDD’s core weaknesses: the fox-guarding-the-henhouse problem. The same AI that wrote the code produces the evidence that the code works. The same reasoning flaw that caused a bug will likely cause the agent to overlook the bug in its evidence.

The mitigation EDD proposes — adversarial review, execution receipts, spot-checking — is the right answer for the specification half of the loop. It puts the human back in the role of critic rather than passive receiver.

But the governance half of the loop has its own version of this problem. If the gate is based on static rules — interface changes always require review — then developers learn to route around it. The rules are too blunt. The process overhead is too high. The gate becomes theater.

The alternative is a gate grounded in something the codebase itself produced: coupling patterns derived from the actual commit history, mapped to the actual ownership structure. This is not smarter than a rule. It is just harder to argue with. When the repository’s own history says that changes in this area have not historically been local decisions, that is not an opinion. That is data.

The last mile is not a place for better prompts. It is a place for better data — and for making sure that data has teeth.


This post extends an earlier piece on the last mile problem in AI-assisted development. Calyntro surfaces change coupling patterns from Git history and maps them to team ownership — turning the repository’s own history into a governance signal. Explore the live demo against the MongoDB open-source repository.