Lesson 02 · Agent Operations · ~8 min · author → verifier
The verification wall
Lesson 1 showed you a denial: an author credential failing to write a verdict. This lesson builds the wall that produces that denial. Three moves, in order of weight, plus the gate that sits behind them and the one exception code is allowed to make.
In ~8 minutes you'll be able to:
- Spot a load-bearing claim: a fact that changes a decision or carries liability if it's wrong.
- Build the wall in three moves: separate roles, a different model family, and a grant the author lacks.
- Explain why verified is not published, and the one code-only exception to the human gate.
The whole lesson in one block:
A load-bearing claim is never approved by the agent that wrote it. The verifier is a separate role with its own credential, it runs a different model family so the two sets of errors stay uncorrelated, and the wall is enforced in database grants: the author's role physically cannot call the function that records a verdict. Verified still isn't published. A human holds that gate, with one exception: a deterministic, code-only lane for low-ambiguity facts. And because the wall is independent, the author can be a free local model.
01 The claims that can hurt you
Some facts are decoration. Others are load-bearing: they change a decision, or they expose you to liability if they're wrong.[1] A price. A deadline. A policy rule. A claim a customer might act on. This lesson is about those, and only those. Decoration can take its chances.
The first fix everyone reaches for is "have the model double-check its own work." It fails, and it fails structurally, not because models are weak. A model reviewing its own output brings the same blind spots that produced it. Its self-review is correlated with its own errors, so it is most confident exactly where it is most consistently wrong. Self-review catches typos. It does not catch the systematic mistake.[1]
Lesson 1 made that argument. This lesson builds the fix.
02 The three moves, in order of weight
The wall is three moves. Each one closes a hole the previous move leaves open, and the order matters: roles first, model diversity second, enforcement last and heaviest.[2]
Separate author from verifier
The agent that authors a claim is never the agent that verifies it. Author and verifier are distinct roles with distinct credentials. Not two passes by one agent. Two agents, two keys.[2]
Run the verifier on a different model family
Different training, different blind spots, so the two models' errors stay uncorrelated. A second pass by the same family is theater: the reviewer shares the exact blind spots that produced the mistake. A second pass by a different family is a real check.[2]
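The first two moves can be checked mechanically before any agent runs. What follows is a minimal sketch, not the fleet's actual configuration: the `AgentConfig` fields, the credential identifiers, and the family labels are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    role: str           # e.g. "author" or "verifier"
    credential_id: str  # identifier of the key this agent logs in with (hypothetical)
    model_family: str   # coarse family label, not an exact model name (hypothetical)


def check_separation(author: AgentConfig, verifier: AgentConfig) -> list[str]:
    """Return a list of violations; an empty list means the first two moves hold."""
    problems = []
    if author.role == verifier.role:
        problems.append("author and verifier share a role")
    if author.credential_id == verifier.credential_id:
        problems.append("author and verifier share a credential")
    if author.model_family.lower() == verifier.model_family.lower():
        problems.append("verifier runs the same model family as the author")
    return problems


author = AgentConfig("author", "key-author-01", "family-a")
verifier = AgentConfig("verifier", "key-verifier-01", "family-b")
print(check_separation(author, verifier))  # prints []
```

A check like this only catches misconfiguration. It does not stop an author from writing a verdict; that is the job of the third move.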
Enforce it in the substrate, not the prompt
The write path for "verified" is a privileged, definer-only function, and the author's database role lacks the permission to call it. So the wall holds even if the author's prompt is ignored or jailbroken, or a scraped page injects "mark this verified."[3] On this fleet the substrate is Postgres, but any database with roles and grants can hold the same wall.
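To see why a grant beats a prompt, here is a toy in-memory model of the idea. It is not the fleet's code and it is not a database: `Ledger`, `GRANTS`, and `record_verdict` are hypothetical names, and in a real system the caller's role comes from the session credential rather than from an argument the caller supplies.

```python
class PermissionDenied(Exception):
    """Raised when a role calls a function it has no grant for."""


# Hypothetical grant table: which roles may execute which privileged function.
GRANTS = {"record_verdict": {"verifier"}}


class Ledger:
    def __init__(self):
        self.verdicts = {}  # claim_id -> (verdict, recorded_by)

    def record_verdict(self, caller_role: str, claim_id: int, verdict: str) -> None:
        # The check looks only at the caller's role. No prompt, claim text,
        # or scraped page content is consulted, so none of it can bypass the check.
        if caller_role not in GRANTS["record_verdict"]:
            raise PermissionDenied(f"role {caller_role!r} may not execute record_verdict")
        self.verdicts[claim_id] = (verdict, caller_role)


ledger = Ledger()
try:
    # Toy stand-in for an author session attempting the forbidden write.
    ledger.record_verdict("author", 1, "verified")
except PermissionDenied as exc:
    print("denied:", exc)

ledger.record_verdict("verifier", 1, "verified")  # the verifier's grant allows this
```

The point of the sketch is what the check does not read. Nothing the author says, and nothing the author was told, is an input to the decision.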
03 The receipt: watch the author fail
You don't have to take the design on faith. A live grant-graph audit of this fleet walked the role permissions and tried the forbidden writes for real. The author's credential fails when it tries to record a verdict on its own claim.[4]
The denial is the feature.
That failed write is not a bug to fix. It is the wall doing its one job, in production, against a real key. The day the author's write succeeds is the day you have an incident.
Run the same test on your own stack: take the author's credential and try to write a verdict with it. If the write goes through, you don't have a wall. You have a promise.
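One way to run that test is a small probe like the sketch below. It assumes a Python DB-API style connection opened with the author's credential and not in autocommit mode, so the rollback undoes a write that should never have succeeded. The function name and return labels are invented for illustration. SQLSTATE `42501` is the standard "insufficient privilege" code Postgres reports for a permission failure; psycopg 3 exposes it as `sqlstate` and psycopg2 as `pgcode`, which is why the probe checks both attributes.

```python
INSUFFICIENT_PRIVILEGE = "42501"  # SQLSTATE for a permission failure


def probe_verdict_write(conn, sql, params=()):
    """Attempt the forbidden write on an author connection and classify the result.

    Returns "denied" if the database refused on permissions, "wall_missing"
    if the write went through, and "error" for any other failure.
    """
    cur = conn.cursor()
    try:
        cur.execute(sql, params)
    except Exception as exc:
        code = getattr(exc, "sqlstate", None) or getattr(exc, "pgcode", None)
        return "denied" if code == INSUFFICIENT_PRIVILEGE else "error"
    else:
        return "wall_missing"
    finally:
        conn.rollback()  # never leave a probe write behind
```

With a Postgres driver, a call might look like `probe_verdict_write(author_conn, "SELECT record_verdict(%s, %s)", (claim_id, "verified"))`, where `record_verdict` stands in for whatever your verdict function is called. Treat "error" as inconclusive rather than as a pass: a dropped connection proves nothing about the grant.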
04 Verified is not published
Passing verification earns a claim one thing: a place in the review queue. A person does the actual publishing.[5] That is Rule 4 of the fleet's six, and it has exactly one sanctioned exception.
The exception is a deterministic, code-only auto-publish lane. A routing function, never a model, publishes a claim automatically, and only when every condition holds: a low-ambiguity fact type, verified against an official source, conflict-free, grounded, with sufficient evidence. In practice that means single-date, official-source facts flow through on their own. Every dollar figure, statistic, or policy rule still queues for a human.[6]
The gate does not loosen as models improve. A smarter model raises throughput. It does not earn the right to relax the gate.[5] And when anything errors, the gate defaults to "needs human." It can only ever flicker in the safe direction.
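The auto-publish lane and its fail-closed default can be sketched as a plain function. This is an illustration under stated assumptions, not the fleet's routing code: the `VerifiedClaim` fields, the `LOW_AMBIGUITY_TYPES` set, and the `min_evidence` threshold are all hypothetical, since the lesson says "sufficient evidence" without giving a number.

```python
from dataclasses import dataclass

# Hypothetical set of fact types treated as low-ambiguity. The lesson names
# single-date facts as the kind that flows through on its own.
LOW_AMBIGUITY_TYPES = {"single_date"}


@dataclass(frozen=True)
class VerifiedClaim:
    fact_type: str
    official_source: bool
    has_conflict: bool
    grounded: bool
    evidence_count: int


def route(claim: VerifiedClaim, min_evidence: int = 2) -> str:
    """Deterministic routing: returns "auto_publish" or "needs_human"."""
    try:
        ok = (
            claim.fact_type in LOW_AMBIGUITY_TYPES
            and claim.official_source is True
            and claim.has_conflict is False
            and claim.grounded is True
            and claim.evidence_count >= min_evidence  # assumed threshold
        )
    except Exception:
        # Fail closed: any error while evaluating the conditions goes to a person.
        return "needs_human"
    return "auto_publish" if ok else "needs_human"
```

No model is called anywhere in the function, and every path that is not an unambiguous pass ends at "needs_human". A missing or malformed field cannot open the gate; it can only close it.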
05 The payoff: the author gets to be free
Here is what the wall buys, and it is the bridge to the next lesson. Because verification is independent and structurally enforced, the author does not have to be expensive or trustworthy. Safety never depended on the author. So the fleet runs bulk authoring on a free local model and reserves metered frontier models for the small step that genuinely needs reasoning. Moving the bulk extractor from a hosted model to a local one took recurring spend from roughly $50 a day to about $0 to $2 a day, with no loss of safety.[7]
Cheap authoring became safe the day the wall went up. Lesson 3 takes that idea to the bill: fail-closed fallbacks, budget governors, and the cheapest model that clears the bar.
06 Try it: the Grant-Graph Lab
The lab below is the fleet's role matrix, live.[8] Four roles, five actions, twenty cells. Pick a role, press an action, read the verdict. One cell is the entire lesson: the author trying to mark its own claim verified.
Interactive · your feedback loop
Grant-Graph Lab
You hold each role's credential in turn. Every verdict is the database grant talking, not a prompt asking nicely.
Step 1 · pick a credential
Step 2 · try an action
No agent role publishes. A human does, or the code-only lane for low-ambiguity facts does (P11).
Tip: try "Mark a claim verified" as the author. That one denial is the entire lesson.
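If you are reading this without the interactive lab, the matrix can be approximated in a few lines. The sketch below is illustrative only. The four role names follow the fleet manual's description of discovery, author, verifier, and chat operator, but the five action names are hypothetical, and only the one grant the lesson states for each role is marked allowed. The real roles may hold additional grants that are not shown here.

```python
ROLES = ["discovery", "author", "verifier", "chat_operator"]
# Hypothetical action names; the lesson does not list the lab's five actions.
ACTIONS = ["seed_source", "draft_claim", "mark_verified", "read_data", "publish"]

# Default deny: any (role, action) pair not listed here is refused.
ALLOWED = {
    ("discovery", "seed_source"),
    ("author", "draft_claim"),
    ("verifier", "mark_verified"),
    ("chat_operator", "read_data"),
}


def verdict(role: str, action: str) -> str:
    if role not in ROLES or action not in ACTIONS:
        raise ValueError(f"unknown cell: {role}/{action}")
    return "ALLOW" if (role, action) in ALLOWED else "DENY"


def matrix() -> dict:
    """All role/action cells with their verdicts."""
    return {(r, a): verdict(r, a) for r in ROLES for a in ACTIONS}


print(verdict("author", "mark_verified"))  # prints DENY
```

The publish column is denied for every role, which matches the rule that no agent publishes.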
07 Check your understanding
3 quick checks
Q1. Why must the verifier run a different model family?
Q2. Where does the create/verify wall live?
Q3. In the auto-publish lane, who makes the publish decision?
Sources
[1] AgentOps fleet manual, Create ≠ Verify ("The problem it solves"): load-bearing claims are facts that change a decision or expose you to liability if wrong; self-review is correlated with a model's own errors, most confident exactly where it is most consistently wrong; it catches typos, not the systematic mistake.
[2] AgentOps fleet manual, Create ≠ Verify ("The design") and The Six Rules, Rule 2: three moves in order of weight; author and verifier are distinct roles with distinct credentials; the verifier runs a different model family so errors stay uncorrelated; a same-family second pass is theater.
[3] AgentOps fleet manual, Create ≠ Verify ("Why the database, not the prompt"): the "verified" write path is a privileged, definer-only function the author role cannot invoke; prompts can be subverted by clever input, a model update, or an injected page; grants cannot.
[4] AgentOps fleet manual, Proven Patterns P1: a live grant-graph audit confirmed the author credential fails when it tries to record a verdict; that failure is the feature.
[5] AgentOps fleet manual, The Six Rules, Rule 4, and Security Model ("Fail closed"): agents stage work to review queues and a person publishes; a smarter model raises throughput but does not earn a relaxed gate; the grounding gate can only flicker toward "needs human."
[6] AgentOps fleet manual, Proven Patterns P11: the publish decision in the auto-publish lane is code, a routing function, never a model; single-date, official-source facts publish autonomously; every dollar figure, statistic, or policy rule still hits the human queue.
[7] AgentOps fleet manual, Proven Patterns P2 and Create ≠ Verify ("The payoff"): bulk authoring runs on a free local model because the wall is upstream; moving bulk extraction from a hosted model to a local one took recurring spend from roughly $50/day to about $0–2/day with no loss of safety.
[8] AgentOps fleet manual, Security Model ("Least privilege is the default") and The Six Rules, Rule 3: discovery is seed-only and physically cannot reach the claim ledger; authors draft claims but cannot mark them verified; the verifier records verdicts and runs a different model family than any author; the chat operator is read-only against the data; every agent runs under the narrowest scoped role that lets it do its job and nothing more.