AI-Driven Development · intermediate · 7 min

Verifying AI Output

AI output is fast but fallible — never trust it blindly. Run it, read it, and have a skeptic try to break it.

An AI will hand you a finished change in seconds, written in a calm, authoritative voice, looking for all the world like it knows exactly what it's doing. That fluency is a trap. The model is optimizing for plausible text, and plausible is not the same as correct — it can misread the request, invent an API that doesn't exist, or quietly break something three files away. The single most important habit in AI-driven development is the one people most want to skip: never trust AI output blindly. Verify it.

Verify it like you'd verify a human

Here's the reassuring part: you already own the tools for this. You don't need a special "AI verifier" — you need the exact same checks you'd apply to a pull request from a new teammate. Run the tests. Build it. Read the diff. A human's confidence doesn't exempt their code from CI, and neither should an AI's. The bar is identical, and so is the toolkit.

The power of tests and builds is that they're objective. A failing test doesn't care how sure the model sounded; a broken build is a broken build. These checks convert a fluent paragraph of "I've implemented this correctly" into a hard yes-or-no signal. They're cheap to run and they catch the most embarrassing failures — the code that doesn't even compile, the change that breaks an existing case.
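As a rough illustration of that yes-or-no signal, here is a minimal Python sketch that runs each check as a subprocess and treats the exit code as the verdict. The names `GateResult`, `run_gate`, `run_gates`, and `example_gates` are made up for this example, and the pytest and compileall commands are placeholders that assume a Python project; substitute whatever your project actually uses for tests and builds.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class GateResult:
    name: str
    passed: bool
    output: str


def run_gate(name: str, command: list[str]) -> GateResult:
    """Run one objective check. A zero exit code is the only thing that counts as a pass."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return GateResult(name, proc.returncode == 0, proc.stdout + proc.stderr)


def run_gates(gates: dict[str, list[str]]) -> list[GateResult]:
    """Run the gates in order and stop at the first failure."""
    results = []
    for name, command in gates.items():
        result = run_gate(name, command)
        results.append(result)
        if not result.passed:
            break
    return results


def example_gates() -> dict[str, list[str]]:
    # Placeholder commands (assumptions, not part of the article):
    # replace them with your project's real test and build steps.
    return {
        "tests": [sys.executable, "-m", "pytest", "-q"],
        "build": [sys.executable, "-m", "compileall", "-q", "."],
    }
```

Calling `run_gates(example_gates())` returns one result per gate that ran. Nothing in the verdict depends on what the model said about its own change, only on the exit codes.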

But green tests aren't the finish line. Plenty of changes pass every check and still do the wrong thing: they solve a subtly different problem, leave dead code behind, or take a shortcut you'd never accept. That's why you read the diff with your own eyes. Tests prove the code does what the tests check; reading proves it does what you actually meant.
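A small, invented example of a change that is green but wrong. Suppose the request was "remove duplicates and keep the original order", and the suite contains a single test. The function names and the scenario are hypothetical, chosen only to show the gap between passing and correct.

```python
def dedupe(items):
    """Hypothetical AI-written helper: removes duplicates, but also sorts."""
    return sorted(set(items))


def test_dedupe_removes_duplicates():
    # The only test in the suite. It passes, because the input is already sorted.
    assert dedupe([1, 1, 2, 3, 3]) == [1, 2, 3]


def dedupe_keep_order(items):
    """What was actually asked for: drop duplicates, keep first-seen order."""
    # dict preserves insertion order, so the keys come back in first-seen order.
    return list(dict.fromkeys(items))
```

The test is green, yet `dedupe(["b", "a", "b"])` comes back reordered, which is not what was requested. The `sorted(set(...))` line is exactly the kind of shortcut that stands out in a diff and never shows up in a test report.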

How it works: gates the change must pass

Think of verification as a set of gates the change has to clear before you trust it. The objective gates — tests and the build — come first because they're fast and unambiguous. For things you can't compile, like a claim ("this is the root cause") or a review ("this code is safe"), you add a different kind of gate: an independent, even adversarial, check. Instead of asking an agent "is this right?" — which invites a rubber stamp — you task a second agent with trying to refute the first one. A skeptic with fresh eyes and no stake in the original answer finds holes a yes-man never will. The diagram below shows a change running these gates, ending either confirmed or rejected.

[Figure: Run it through the gates before you trust it. Nodes: AI change, Run tests, Build it, Skeptic agent, then Confirmed or Rejected.]
An AI change flows through tests, a build, and an adversarial reviewer. It's only confirmed if the checks pass; otherwise it's rejected.

That adversarial reviewer is where this connects to multi-agent orchestration: the cleanest way to get an unbiased second opinion is a separate agent with its own context, so it isn't quietly anchored to whatever the first agent already decided.
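The gates described above can be sketched as a small pipeline. This is one possible shape, not a prescribed implementation: `Gate`, `Verdict`, `verify`, and `skeptic_gate` are names invented for the example, and the reviewer is modeled as any function that returns a list of objections.

```python
from dataclasses import dataclass, field
from typing import Callable

# A gate takes the change and returns (passed, reason).
Gate = Callable[[str], tuple[bool, str]]


@dataclass
class Verdict:
    confirmed: bool
    log: list[str] = field(default_factory=list)


def verify(change: str, gates: list[tuple[str, Gate]]) -> Verdict:
    """Run the gates in order; the change is confirmed only if every gate passes."""
    verdict = Verdict(confirmed=True)
    for name, gate in gates:
        passed, reason = gate(change)
        verdict.log.append(f"{name}: {'pass' if passed else 'FAIL'} ({reason})")
        if not passed:
            verdict.confirmed = False
            break  # rejected: later gates are not run
    return verdict


def skeptic_gate(find_objections: Callable[[str], list[str]]) -> Gate:
    """Wrap a reviewer whose job is to refute the change.

    The gate passes only when the reviewer comes back with no objections.
    """
    def gate(change: str) -> tuple[bool, str]:
        objections = find_objections(change)
        if objections:
            return False, "; ".join(objections)
        return True, "no objections found"
    return gate
```

Putting the cheap, objective gates first in the list means the skeptic is only consulted for changes that already pass tests and build.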

Note

In our stack, Claude Code is built to verify its own work, not just produce it: it can run your test suite, build the project, and show you the diff, so the objective gates happen automatically. For claims and reviews you can go further by spawning a subagent (a separate Claude instance with a fresh context window) whose explicit job is to poke holes in the first agent's output rather than agree with it. Each agent runs on one of Anthropic's Claude models; pairing a builder with an adversarial reviewer turns "trust me" into "here's the evidence."
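To make the "refute, don't rubber-stamp" idea concrete, here is a tool-agnostic sketch. It does not use any Claude Code API: `AskAgent` stands in for whatever mechanism you have for sending a prompt to a fresh agent, and `Claim`, `refutation_prompt`, `adversarial_review`, and the "NO FLAW FOUND" reply convention are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical: any function that sends a prompt to a fresh agent and returns its reply.
AskAgent = Callable[[str], str]


@dataclass
class Claim:
    statement: str  # e.g. "this is the root cause"
    evidence: str   # the diff, logs, or test output the claim rests on


def refutation_prompt(claim: Claim) -> str:
    """Build a prompt that asks for a refutation instead of a confirmation.

    Only the claim and its evidence are included. The first agent's reasoning
    is deliberately left out so the reviewer is not anchored to it.
    """
    return (
        "Try to refute the following claim using only the evidence given.\n"
        f"Claim: {claim.statement}\n"
        f"Evidence:\n{claim.evidence}\n"
        "If you find a flaw, describe it. "
        "If you cannot find one, reply exactly: NO FLAW FOUND"
    )


def adversarial_review(claim: Claim, ask_reviewer: AskAgent) -> tuple[bool, str]:
    """Return (survived, reply). The claim survives only if no flaw is reported."""
    reply = ask_reviewer(refutation_prompt(claim))
    survived = reply.strip().upper().startswith("NO FLAW FOUND")
    return survived, reply
```

A claim that survives this review is not proven, but it has at least faced a reviewer that was asked to break it rather than to agree with it.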

Trust, but verify

The old proverb fits perfectly: trust but verify. Trust the AI enough to delegate real work to it — that's the whole point of AI-driven development — but never let that trust become an excuse to skip the check. Verification isn't a sign you doubt the tool; it's the discipline that makes leaning on it safe. It's also what feeds your review gates: an artifact that's been tested, built, and adversarially reviewed is one that's actually ready to pass the gate the first time.

Key takeaways

  • AI output is fast and fluent, which makes it easy to over-trust. Speed is not correctness.
  • Verify AI work with the same tools you'd use on a human's: run the tests, build it, read the diff.
  • Objective checks — tests and builds — give a verdict that doesn't care how confident the AI sounded.
  • Reading the diff catches what passes the build but still isn't what you meant.
  • For claims and reviews, use independent (even adversarial) checks — a second agent trying to refute the first.
  • Trust but verify: treat AI as a capable collaborator whose work you always confirm.
