Multi-agent quorum: 2-of-3 distinct LLM backends required for irreversible actions
By Jason, Founder · 2 min read · Wave 292
Summary
Wave 292H ships the multi-agent quorum policy. Any irreversible action above $50K, or in a flagged category, requires 2-of-3 consensus from distinct LLM backends (Claude, Gemini, GPT-4). A failure at a single vendor cannot, on its own, commit a contract signing or an escrow release.
Article body
Patent provisional 02 (Wave 294A) describes the design. An irreversible action (a $50K+ approval, a contract signing, an escrow release, a contractor takedown) must be approved by 2 of 3 LLM backends, and the three backends must come from different vendors. We use Claude, Gemini, and GPT-4. If the backends fail to reach a 2-of-3 majority, we escalate to a human (Jason).
The threat model is straightforward. A single-vendor deployment is a single point of failure: one prompt injection, jailbreak, model bug, or off-policy output is enough to commit a bad action. If a contract drafter (Wave 292K canary) hallucinates a $250K change order and signs it, AskBaily owes the money. The mitigation is to require independent agreement across vendors before any irreversible action commits. If Claude and Gemini both think the change order is valid, our confidence is materially higher than if Claude alone says so. If no 2-of-3 majority emerges, the action does not commit and Jason's phone rings.
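The 2-of-3 rule above can be sketched as a pure scoring function. This is a minimal illustration, not the code in lib/agents/quorum/: the type names, the `abstain` vote (standing in for a timeout or an unparseable response), and the `scoreConsensus` function are all assumptions made for the example.

```typescript
// Hypothetical types; the real definitions are not shown in this post.
type Vendor = "claude" | "gemini" | "gpt-4";
type Vote = "approve" | "reject" | "abstain";
type Decision = "APPROVE" | "REJECT" | "ESCALATE_HUMAN";

interface BackendVote {
  vendor: Vendor;
  vote: Vote;
}

function scoreConsensus(votes: BackendVote[], threshold = 2): Decision {
  // Count each vendor at most once so a duplicated backend cannot vote twice.
  const byVendor = new Map<Vendor, Vote>();
  for (const v of votes) {
    if (!byVendor.has(v.vendor)) byVendor.set(v.vendor, v.vote);
  }

  let approvals = 0;
  let rejections = 0;
  byVendor.forEach((vote) => {
    if (vote === "approve") approvals += 1;
    if (vote === "reject") rejections += 1;
  });

  if (approvals >= threshold) return "APPROVE";
  if (rejections >= threshold) return "REJECT";
  // No majority either way: do not commit, hand off to a human.
  return "ESCALATE_HUMAN";
}
```

Keeping the scoring step free of I/O means the consensus math can be tested offline without touching any backend.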
The implementation lives in lib/agents/quorum/. policies.ts declares the action types (currently five: contract_sign, escrow_release, contractor_takedown, refund_above_50k, dispute_resolution_above_25k) and the consensus threshold for each. coordinator.ts is the dispatcher: it receives an action, composes the prompt, fans out to the three backends, collects the responses, scores consensus, and returns APPROVE, REJECT, or ESCALATE_HUMAN. voter-stub.ts is the per-backend wrapper; in production it calls the real model APIs, and in tests it is mocked deterministically. escalation.ts defines what happens on ESCALATE_HUMAN: page Jason via SMS, log to /admin/quorum-queue, and do not commit the action.
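The dispatcher flow described above (look up the policy, compose a prompt, fan out, collect, score) might look roughly like the sketch below. Only the five action-type names and the three decision values come from this post. The `POLICIES` shape, the placeholder thresholds, the prompt wording, the `decide` function, and the rule that a failed backend counts as an abstention are assumptions.

```typescript
type ActionType =
  | "contract_sign"
  | "escrow_release"
  | "contractor_takedown"
  | "refund_above_50k"
  | "dispute_resolution_above_25k";
type Vote = "approve" | "reject" | "abstain";
type Decision = "APPROVE" | "REJECT" | "ESCALATE_HUMAN";

// One voter wraps one backend. In this sketch every voter is a stub.
type Voter = (prompt: string) => Promise<Vote>;

// Hypothetical registry: the threshold values here are placeholders.
const POLICIES: Record<ActionType, { threshold: number }> = {
  contract_sign: { threshold: 2 },
  escrow_release: { threshold: 2 },
  contractor_takedown: { threshold: 2 },
  refund_above_50k: { threshold: 2 },
  dispute_resolution_above_25k: { threshold: 2 },
};

async function decide(
  action: ActionType,
  details: string,
  voters: Voter[],
): Promise<Decision> {
  const { threshold } = POLICIES[action];
  const prompt = `Action: ${action}\nDetails: ${details}\nAnswer approve or reject.`;

  // Fan out in parallel; a backend that fails is treated as an abstention
  // (an assumption of this sketch) so one outage cannot approve or reject.
  const votes = await Promise.all(
    voters.map((voter) => voter(prompt).catch((): Vote => "abstain")),
  );

  const approvals = votes.filter((v) => v === "approve").length;
  const rejections = votes.filter((v) => v === "reject").length;
  if (approvals >= threshold) return "APPROVE";
  if (rejections >= threshold) return "REJECT";
  return "ESCALATE_HUMAN";
}

// Deterministic stubs standing in for the three vendor wrappers.
const stub = (vote: Vote): Voter => async () => vote;
const failing: Voter = async () => {
  throw new Error("backend unavailable");
};
```

Calling `decide("escrow_release", "...", [stub("approve"), failing, stub("reject")])` resolves to ESCALATE_HUMAN in this sketch, because neither side reaches the threshold.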
The cost model: an irreversible action costs 3x a single-vendor call, because all three backends are queried. That is acceptable because irreversible actions are less than 0.5 percent of agent calls. Routine matching, scoping, content drafting, and L1 support all run single-vendor (Claude or Gemini) at standard cost. Only the gating layer for the $50K+ category pays the 3x premium. At 100 actions per month, the premium is low five figures per year, which is small next to the exposure from a catastrophic single-vendor failure.
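The arithmetic behind the 3x premium can be made explicit. The sketch below assumes every backend call costs about the same. The per-call cost is an input because this post does not state it, and both function names are hypothetical.

```typescript
// Extra spend per year from querying `backends` vendors instead of one.
function annualQuorumPremiumUsd(
  actionsPerMonth: number,
  costPerCallUsd: number, // not stated in the post; supply your own figure
  backends = 3,
): number {
  const extraCallsPerAction = backends - 1;
  return actionsPerMonth * 12 * extraCallsPerAction * costPerCallUsd;
}

// Fleet-wide cost multiplier when only a fraction of calls go through quorum.
function blendedCostMultiplier(quorumFraction: number, backends = 3): number {
  return 1 + quorumFraction * (backends - 1);
}
```

At 100 actions per month, the quorum adds 2,400 extra backend calls per year. With a quorum share of 0.5 percent of calls, the fleet-wide multiplier is about 1.01, an overhead of roughly 1 percent under the equal-cost assumption.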
We also enforce vendor independence at the infrastructure layer. The three backends call through three different API gateways with three different auth credentials. A single API-key compromise does not bypass the quorum because the compromised key only authenticates one of the three votes. The quorum still needs at least one of the other two backends, each behind its own credential, to agree.
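One way to guard this property is a startup check that refuses to run if any two backends share a vendor, gateway, or credential. The following is a sketch under assumed names: `BackendConfig`, its fields, and `assertIndependent` are hypothetical, and the gateway URLs in the usage example are placeholders.

```typescript
interface BackendConfig {
  vendor: string;
  gatewayUrl: string;
  credentialId: string; // an identifier for the key, never the secret itself
}

// Throws if any two backends share a vendor, gateway, or credential.
function assertIndependent(backends: BackendConfig[]): void {
  const fields: (keyof BackendConfig)[] = ["vendor", "gatewayUrl", "credentialId"];
  for (const field of fields) {
    const seen = new Set<string>();
    for (const backend of backends) {
      const value = backend[field];
      if (seen.has(value)) {
        throw new Error(`quorum backends share ${field}: ${value}`);
      }
      seen.add(value);
    }
  }
}

// Usage with placeholder values.
assertIndependent([
  { vendor: "claude", gatewayUrl: "https://gw-a.example.invalid", credentialId: "key-a" },
  { vendor: "gemini", gatewayUrl: "https://gw-b.example.invalid", credentialId: "key-b" },
  { vendor: "gpt-4", gatewayUrl: "https://gw-c.example.invalid", credentialId: "key-c" },
]);
```

Failing fast at configuration time means a misconfigured deployment, for example two voters pointed at the same key, is caught before any vote is cast.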
47 tests cover the consensus math (2-of-3, 3-of-3 strict, escalation paths), the policy registry (every action type has a policy), the voter-stub (deterministic test mode), and the escalation routing (Jason gets the SMS). The suite mocks the LLM calls so it runs offline; production calls are billed against the agent fleet cost governor (Wave 9.6).
The strategic frame is in patent provisional 02. AskBaily is the first marketplace publishing a multi-vendor quorum gate as a primary safety primitive. Angi, Thumbtack, HomeAdvisor, and Houzz route irreversible actions through human review. We route them through algorithmic consensus and only escalate to human on disagreement. The result is faster decisions on uncontested actions and safer decisions on borderline ones.
Sources & references
Commit attestation
- Commit SHA: 9d84ccb58c4b7e3f9d2a8b1c5e7f6a3d2b1c8e9f
- Tests green: 47
- Files changed: 7
- Lines added: 824
- Waves: 292
- Author: jason
Commit SHAs are from the AskBaily private repository. If you are a journalist, researcher, or regulator and need access to verify, email [email protected].
Frequently asked
- What action types require quorum?
- Five today: contract_sign, escrow_release above $50K, contractor_takedown, refund above $50K, dispute resolution above $25K. The list grows as new agent capabilities ship. Any action under the threshold runs single-vendor at standard latency.
- What happens if all three backends disagree?
- Escalation to Jason via SMS plus a /admin/quorum-queue entry. The action does not commit. We log the disagreement with full prompts and responses so we can audit which backend was off-policy and why.
- Why three vendors and not two?
- Two vendors give you parity but no tie-breaker; if they disagree you have no way to decide which is right. Three vendors give a 2-of-3 majority and an explicit minority dissent. The minority dissent is itself a signal — if Claude and Gemini agree but GPT-4 dissents, we capture that for offline review even when the action commits.
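Capturing the minority dissent mentioned in the last answer can be sketched as a small helper that names the backends voting against the majority. The `BackendVote` shape and the `minorityDissent` function are assumptions for illustration. How the real system records dissent is not described in this post.

```typescript
interface BackendVote {
  vendor: string;
  vote: "approve" | "reject";
}

// Returns the vendors that voted against the majority (empty if unanimous or tied).
function minorityDissent(votes: BackendVote[]): string[] {
  const approvals = votes.filter((v) => v.vote === "approve").length;
  const rejections = votes.length - approvals;
  if (approvals === rejections) return []; // no majority, so no minority to report
  const majority = approvals > rejections ? "approve" : "reject";
  return votes.filter((v) => v.vote !== majority).map((v) => v.vendor);
}
```

For the case in the answer above, where Claude and Gemini approve and GPT-4 rejects, the helper returns a list containing only "gpt-4", which can then be logged for offline review while the action commits.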