Lesson 2 of 5

Lesson 02·Agent Operations·~8 min·author → verifier

The verification wall

Lesson 1 showed you a denial: an author credential failing to write a verdict. This lesson builds the wall that produces that denial. Three moves, in order of weight, plus the gate that sits behind them and the one exception code is allowed to make.

🎯 Why this is Lesson 2. The first failure mode is the lie: a wrong claim that gets approved. The wall in this lesson is the single most important pattern in the fleet, because everything else, cheap bulk authoring, autonomous throughput, trusting a local model with real work, is only safe because this wall exists.

In ~8 minutes you'll be able to

Spot a load-bearing claim: a fact that changes a decision or carries liability if it's wrong.
Build the wall in three moves: separate roles, a different model family, and a grant the author lacks.
Explain why verified is not published, and the one code-only exception to the human gate.

The whole lesson in one block:

The answer, up front

A load-bearing claim is never approved by the agent that wrote it. The verifier is a separate role with its own credential, it runs a different model family so the two sets of errors stay uncorrelated, and the wall is enforced in database grants: the author's role physically cannot call the function that records a verdict. Verified still isn't published. A human holds that gate, with one exception: a deterministic, code-only lane for low-ambiguity facts. And because the wall is independent, the author can be a free local model.

01The claims that can hurt you

Some facts are decoration. Others are load-bearing: they change a decision, or they expose you to liability if they're wrong.^[1] A price. A deadline. A policy rule. A claim a customer might act on. This lesson is about those, and only those. Decoration can take its chances.

The first fix everyone reaches for is "have the model double-check its own work." It fails, and it fails structurally, not because models are weak. A model capable of reviewing its own output shares its own blind spots. Its self-review is correlated with its own errors, so it is most confident exactly where it is most consistently wrong. Self-review catches typos. It does not catch the systematic mistake.^[1]

Lesson 1 made that argument. This lesson builds the fix.

02The three moves, in order of weight

The wall is three moves. Each one closes a hole the previous move leaves open, and the order matters: roles first, model diversity second, enforcement last and heaviest.^[2]

Roles and credentials

Separate author from verifier

The agent that authors a claim is never the agent that verifies it. Author and verifier are distinct roles with distinct credentials. Not two passes by one agent. Two agents, two keys.^[2]

What this buysAn approval now requires a second, independent actor. One compromised or confused agent cannot complete the loop alone.

Uncorrelated errors

Run the verifier on a different model family

Different training, different blind spots, so the two models' errors stay uncorrelated. A second pass by the same family is theater: the reviewer shares the exact blind spots that produced the mistake. A second pass by a different family is a real check.^[2]

What this buysThe systematic error. The mistake the author makes every time is the one a same-family reviewer also misses every time.

Grants, not requests

Enforce it in the substrate, not the prompt

The write path for "verified" is a privileged, definer-only function, and the author's database role lacks the permission to call it. So the wall holds even if the author's prompt is ignored, jailbroken, or a scraped page injects "mark this verified."^[3] On this fleet the substrate is Postgres, but any database with roles and grants can hold the same wall.

What this buysSurvival. A prompt is a request; a grant is a law. The grant outlives the failure of every softer control.

The author's credential physically cannot write the verified state. Only the independent verifier can, and a human (or a deterministic code-only lane for low-ambiguity facts) does the publishing. (AgentOps fleet manual, P1/P11; Rules 2 and 4)

03The receipt: watch the author fail

You don't have to take the design on faith. A live grant-graph audit of this fleet walked the role permissions and tried the forbidden writes for real. The author's credential fails when it tries to record a verdict on its own claim.^[4]

The receipt

The denial is the feature.

That failed write is not a bug to fix. It is the wall doing its one job, in production, against a real key. The day the author's write succeeds is the day you have an incident.

Run the same test on your own stack: take the author's credential and try to write a verdict with it. If the write goes through, you don't have a wall. You have a promise.

04Verified is not published

Passing verification earns a claim one thing: a place in the review queue. A person does the actual publishing.^[5] That is Rule 4 of the fleet's six, and it has exactly one sanctioned exception.

The exception is a deterministic, code-only auto-publish lane. A routing function, never a model, publishes a claim automatically, and only when every condition holds: a low-ambiguity fact type, verified against an official source, conflict-free, grounded, with sufficient evidence. In practice that means single-date, official-source facts flow through on their own. Every dollar figure, statistic, or policy rule still queues for a human.^[6]

The gate does not loosen as models improve. A smarter model raises throughput. It does not earn the right to relax the gate.^[5] And when anything errors, the gate defaults to "needs human." It can only ever flicker in the safe direction.

05The payoff: the author gets to be free

Here is what the wall buys, and it is the bridge to the next lesson. Because verification is independent and structurally enforced, the author does not have to be expensive or trustworthy. Safety never depended on the author. So the fleet runs bulk authoring on a free local model and reserves metered frontier models for the small step that genuinely needs reasoning. Moving the bulk extractor from a hosted model to a local one took recurring spend from roughly $50 a day to about $0 to $2 a day, with no loss of safety.^[7]

Cheap authoring became safe the day the wall went up. Lesson 3 takes that idea to the bill: fail-closed fallbacks, budget governors, and the cheapest model that clears the bar.

06Try it: the Grant-Graph Lab

The lab below is the fleet's role matrix, live.^[8] Four roles, five actions, twenty cells. Pick a role, press an action, read the verdict. One cell is the entire lesson: the author trying to mark its own claim verified.

Interactive · your feedback loop

Grant-Graph Lab

You hold each role's credential in turn. Every verdict is the database grant talking, not a prompt asking nicely.

Step 1 · pick a credential

Step 2 · try an action

You hold the author's credential. Try an action.

Nobody publishes. A human does, or the code-only lane for low-ambiguity facts (P11).

Tip: try "Mark a claim verified" as the author. That one denial is the entire lesson.

07Check your understanding

3 quick checks

Click an answer for instant feedback. One try per question. Nothing is sent anywhere.

Q1Why must the verifier run a different model family?

Different training means different blind spots. A second pass by the same family shares the exact blind spots that produced the mistake, which makes it theater, not a check.^[2]

Q2Where does the create/verify wall live?

The "verified" write path is a definer-only function the author's role cannot call. A prompt or a comment is a request; the grant is a law.^[3]

Q3In the auto-publish lane, who makes the publish decision?

The lane publishes only when every condition holds, and the decision is a routing function, never a model. Anything ambiguous or load-bearing stays human-gated.^[6]

Next up (Lesson 3): The Money Lesson. The wall makes cheap authoring safe; Lesson 3 makes it cheap on purpose. Empty fallbacks so a dead local model stalls loudly instead of quietly billing a hosted one, a budget governor that refuses to dispatch when it can't read the balance, and the cheapest model that clears the bar.

Sources

AgentOps fleet manual, Create ≠ Verify ("The problem it solves"): load-bearing claims are facts that change a decision or expose you to liability if wrong; self-review is correlated with a model's own errors, most confident exactly where it is most consistently wrong; it catches typos, not the systematic mistake. ↩
AgentOps fleet manual, Create ≠ Verify ("The design") and The Six Rules, Rule 2: three moves in order of weight; author and verifier are distinct roles with distinct credentials; the verifier runs a different model family so errors stay uncorrelated; a same-family second pass is theater. ↩
AgentOps fleet manual, Create ≠ Verify ("Why the database, not the prompt"): the "verified" write path is a privileged, definer-only function the author role cannot invoke; prompts can be subverted by clever input, a model update, or an injected page; grants cannot. ↩
AgentOps fleet manual, Proven Patterns P1: a live grant-graph audit confirmed the author credential fails when it tries to record a verdict; that failure is the feature. ↩
AgentOps fleet manual, The Six Rules, Rule 4, and Security Model ("Fail closed"): agents stage work to review queues and a person publishes; a smarter model raises throughput but does not earn a relaxed gate; the grounding gate can only flicker toward "needs human." ↩
AgentOps fleet manual, Proven Patterns P11: the publish decision in the auto-publish lane is code, a routing function, never a model; single-date, official-source facts publish autonomously; every dollar figure, statistic, or policy rule still hits the human queue. ↩
AgentOps fleet manual, Proven Patterns P2 and Create ≠ Verify ("The payoff"): bulk authoring runs on a free local model because the wall is upstream; moving bulk extraction from a hosted model to a local one took recurring spend from roughly $50/day to about $0–2/day with no loss of safety. ↩
AgentOps fleet manual, Security Model ("Least privilege is the default") and The Six Rules, Rule 3: discovery is seed-only and physically cannot reach the claim ledger; authors draft claims but cannot mark them verified; the verifier records verdicts and runs a different model family than any author; the chat operator is read-only against the data; every agent runs under the narrowest scoped role that lets it do its job and nothing more. ↩

— end of lesson 2 —