Decision Integrity

A decision record is only valuable if you can trust it. quint-code uses several mechanisms to keep decisions honest, evidence real, and the knowledge base clean.

Adversarial verification gate

Before recording any decision, the agent runs a verification check. The principle: the agent that generated the options cannot be the sole validator of those options (FPF A.12 — External Transformer Principle).

For tactical decisions (quick, reversible):

  • One-line counter-argument: "The strongest argument against this decision is..."
  • If the counter-argument kills the decision → back to exploring

For standard/deep decisions:

  1. Deductive consequences — "If this is correct, what 3 things must be true?"
  2. Strongest counter-argument — genuine, not a strawman
  3. Self-evidence check — "Is the only evidence from this same conversation?"
  4. Tail failure scenarios — low-probability, high-impact failure modes
  5. WLNK (weakest link) challenge — "Is the stated weakest link actually the weakest?"
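
The two checklists above can be sketched as a small gate function. This is an illustrative sketch only: the check names, the `mode` strings, and the `gate` function are hypothetical and not quint-code's actual API. Sending a decision "back to exploring" in standard/deep mode is also an assumption, since the text states that rule only for tactical decisions.

```python
# Hypothetical check identifiers; quint-code's real names may differ.
TACTICAL_CHECKS = ["counter_argument"]
STANDARD_CHECKS = [
    "deductive_consequences",
    "counter_argument",
    "self_evidence",
    "tail_failures",
    "wlnk_challenge",
]


def required_checks(mode: str) -> list[str]:
    """Return the checks that must be answered for a given decision mode."""
    if mode == "tactical":
        return list(TACTICAL_CHECKS)
    if mode in ("standard", "deep"):
        return list(STANDARD_CHECKS)
    raise ValueError(f"unknown decision mode: {mode!r}")


def gate(mode: str, answers: dict[str, str], counter_argument_kills: bool) -> str:
    """Return 'record', 'explore', or an 'incomplete: ...' message."""
    missing = [c for c in required_checks(mode) if not answers.get(c, "").strip()]
    if missing:
        return "incomplete: " + ", ".join(missing)
    # A counter-argument that kills the decision sends it back to exploring.
    if counter_argument_kills:
        return "explore"
    return "record"
```

Under these assumptions, a tactical decision needs only the one-line counter-argument, while a standard or deep decision is held back until all five checks have a non-empty answer.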

Inductive measurement gate

When recording a measurement (verdict: accepted/partial/failed), quint-code checks whether the decision has a baseline, that is, a snapshot of file hashes. If not:

  • Warning appears in the response: "No baseline found — implementation may not be verified"
  • Measurement records at CL1 (0.4 penalty) instead of CL3 (no penalty)
  • R_eff for an unverified measurement: max(0, 1.0 - 0.4) = 0.6, which is still healthy but visibly lower than 1.0
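
The penalty arithmetic above can be written out as a short sketch. The function name, the dictionary layout, and the `base` parameter are assumptions made for illustration; only the two penalty values and the max(0, ...) clamp come from the text.

```python
# CL penalties as stated in this section; other levels are not covered here.
CL_PENALTY = {"CL1": 0.4, "CL3": 0.0}


def measurement_r_eff(has_baseline: bool, base: float = 1.0) -> dict:
    """Pick the CL for a measurement and compute its penalized R_eff."""
    # Without a baseline the measurement is recorded at CL1, otherwise CL3.
    level = "CL3" if has_baseline else "CL1"
    return {
        "level": level,
        "warn_no_baseline": not has_baseline,
        # Clamp at zero so a large penalty can never produce a negative score.
        "r_eff": max(0.0, base - CL_PENALTY[level]),
    }
```

Calling `measurement_r_eff(False)` models the unverified case: the result carries the warning flag, level CL1, and an R_eff of roughly 0.6.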

This prevents the agent from calling measure from memory without actually verifying the implementation — a real problem we discovered and fixed during development.

Evidence supersession

When a new measurement is recorded on a decision that already has a measurement, the old measurement is marked verdict='superseded' and excluded from R_eff computation. This prevents old partial measurements from permanently dragging R_eff down.

Superseded evidence stays in the database for audit — it's not deleted, just excluded from the active chain.
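
A minimal in-memory sketch of this behaviour, assuming a hypothetical `Measurement` record and a plain list in place of the database. The field names are illustrative, not quint-code's schema.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    decision_id: str
    verdict: str  # "accepted", "partial", "failed", or "superseded"


def record(chain: list[Measurement], new: Measurement) -> None:
    """Append a measurement, superseding earlier ones on the same decision."""
    for m in chain:
        if m.decision_id == new.decision_id and m.verdict != "superseded":
            m.verdict = "superseded"  # kept for audit, never deleted
    chain.append(new)


def active(chain: list[Measurement], decision_id: str) -> list[Measurement]:
    """Measurements that would still feed the R_eff computation."""
    return [
        m for m in chain
        if m.decision_id == decision_id and m.verdict != "superseded"
    ]
```

After recording a partial measurement followed by an accepted one on the same decision, the list still holds both entries, but `active` returns only the newer one.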

Note-decision deduplication

Notes and decisions serve different purposes. Notes are observations ("we use Redis here"). Decisions are contracts ("we chose Redis because X, with invariants Y and rollback Z"). When someone tries to record a note that duplicates an existing decision, quint-code catches it.

The check uses containment, the share of the note's words that also appear in the decision title, rather than Jaccard similarity:

  • >70% of note's words in a decision title → rejected with explanation
  • 50-70% → warning, note still recorded
  • <50% → pass silently

The same check runs note-vs-note to prevent duplicate notes from accumulating.
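
The sketch below shows why containment suits this check better than Jaccard similarity: a short note fully covered by a long decision title scores high on containment but low on Jaccard, because the title's extra words inflate the union. The tokenization (lowercased alphanumeric runs) and the function names are assumptions. Only the 50% and 70% thresholds come from the text.

```python
import re


def words(text: str) -> set[str]:
    """Lowercased alphanumeric tokens (an assumed tokenization)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def containment(note: str, title: str) -> float:
    """Fraction of the note's words that also occur in the title."""
    note_words = words(note)
    if not note_words:
        return 0.0
    return len(note_words & words(title)) / len(note_words)


def jaccard(a: str, b: str) -> float:
    """Intersection over union, shown only for comparison."""
    wa, wb = words(a), words(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def check_note(note: str, title: str) -> str:
    """Apply the thresholds above: reject, warn, or pass."""
    c = containment(note, title)
    if c > 0.7:
        return "reject"
    if c >= 0.5:
        return "warn"
    return "pass"
```

For example, the note "use Redis" against a decision titled "Use Redis for session cache with TTL invalidation" has containment 1.0 and is rejected, while its Jaccard score is only 0.25 and would slip under most similarity thresholds.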

Batch cleanup: /q-refresh action="reconcile" scans all active notes against all active decisions in one pass and reports overlaps.

From decisions to machine-verified evidence

The integrity mechanisms above ensure decisions are honest at recording time. But the strongest evidence comes from machine verification — tests that prove invariants hold against real code.

Decision records from the full cycle contain enough structure to serve as test specifications: invariants, post-conditions, admissibility constraints, and affected file paths. Any coding agent can translate these into property-based tests and attach the results as CL3 evidence — the highest confidence level in the R_eff model.

This creates a closed loop: decisions define what must hold, tests verify that it holds, evidence records the verification, and R_eff reflects the trust level. When code drifts (files change after baseline), R_eff drops, and the tests need to re-run. The integrity system doesn't just check that decisions were made honestly — it checks that reality still matches the decision.
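
Drift detection against a baseline can be sketched with file hashes. This is an illustration under stated assumptions: SHA-256 and the function names `snapshot` and `drifted` are choices made here, since the text says only that file hashes are snapshotted.

```python
import hashlib
from pathlib import Path


def file_digest(path: str) -> str:
    """SHA-256 of a file's bytes (the hash algorithm is an assumption)."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def snapshot(paths: list[str]) -> dict[str, str]:
    """Record a baseline: one digest per affected file."""
    return {p: file_digest(p) for p in paths}


def drifted(baseline: dict[str, str]) -> list[str]:
    """Return files that changed or disappeared since the baseline."""
    changed = []
    for path, digest in baseline.items():
        if not Path(path).is_file() or file_digest(path) != digest:
            changed.append(path)
    return changed
```

A non-empty result from `drifted` is the signal that the recorded evidence no longer describes the current code and that the tests attached to the decision should run again.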