Decision Integrity
A decision record is only valuable if you can trust it. quint-code has multiple mechanisms to ensure decisions are honest, evidence is real, and the knowledge base stays clean.
Adversarial verification gate
Before recording any decision, the agent runs a verification check. The principle: the agent that generated the options cannot be the sole validator of those options (FPF A.12 — External Transformer Principle).
For tactical decisions (quick, reversible):
- One-line counter-argument: "The strongest argument against this decision is..."
- If the counter-argument kills the decision → back to exploring
For standard/deep decisions:
- Deductive consequences — "If this is correct, what 3 things must be true?"
- Strongest counter-argument — genuine, not a strawman
- Self-evidence check — "Is the only evidence from this same conversation?"
- Tail failure scenarios — low-probability, high-impact failure modes
- WLNK challenge — "Is the stated weakest link actually the weakest?"
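The gate above can be sketched as a small function. This is a hypothetical illustration, not quint-code's actual API: the challenge wording comes from the checklists above, but the function names and the `survives_challenge` callback are assumptions.

```python
# Hypothetical sketch of the adversarial verification gate.
# The challenges mirror the checklists above; the structure is illustrative.

TACTICAL_CHALLENGES = [
    "The strongest argument against this decision is...",
]

DEEP_CHALLENGES = [
    "Deductive consequences: if this is correct, what 3 things must be true?",
    "Strongest counter-argument (genuine, not a strawman)",
    "Self-evidence check: is the only evidence from this same conversation?",
    "Tail failure scenarios: low-probability, high-impact failure modes",
    "WLNK challenge: is the stated weakest link actually the weakest?",
]

def verification_gate(depth: str, survives_challenge) -> bool:
    """Run every challenge for the given depth.

    Any killing blow sends the decision back to exploring (returns False).
    `survives_challenge` stands in for the external validator -- per FPF A.12,
    it must not be the same agent that generated the options.
    """
    challenges = TACTICAL_CHALLENGES if depth == "tactical" else DEEP_CHALLENGES
    return all(survives_challenge(c) for c in challenges)
```

A decision whose counter-argument lands is simply not recorded: `verification_gate("tactical", lambda c: False)` returns `False`, and the agent goes back to exploring.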
Inductive measurement gate
When recording a measurement (verdict: accepted/partial/failed), quint-code checks
whether the decision has a baseline (file hashes were snapshotted). If not:
- Warning appears in the response: "No baseline found — implementation may not be verified"
- Measurement records at CL1 (0.4 penalty) instead of CL3 (no penalty)
- R_eff for an unverified measurement: max(0, 1.0 - 0.4) = 0.6, still healthy but visibly lower than 1.0
This prevents the agent from calling measure from memory without actually
verifying the implementation — a real problem we discovered and fixed during development.
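The penalty arithmetic above can be written out directly. A minimal sketch, assuming only the two levels the text names (CL1 at 0.4, CL3 at 0.0); the dictionary, the function name, and the worst-case default for unnamed levels are all assumptions, not quint-code's real implementation:

```python
# Penalties per confidence level; only CL1 and CL3 appear in the text.
CL_PENALTY = {1: 0.4, 3: 0.0}

def r_eff(confidence_level: int, base: float = 1.0) -> float:
    """Effective reliability after the confidence-level penalty.

    Levels not listed default to the CL1 penalty (an assumption: treat
    unknown evidence as unverified rather than trusted).
    """
    penalty = CL_PENALTY.get(confidence_level, 0.4)
    return round(max(0.0, base - penalty), 3)
```

So a measurement recorded without a baseline lands at `r_eff(1) == 0.6`, while a verified one keeps `r_eff(3) == 1.0` — the gap is visible but not fatal.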
Evidence supersession
When a new measurement is recorded on a decision that already has a measurement, the old
measurement is marked verdict='superseded' and excluded from R_eff computation.
This prevents old partial measurements from permanently dragging R_eff down.
Superseded evidence stays in the database for audit — it's not deleted, just excluded from the active chain.
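The supersession rule can be sketched in a few lines. This is an illustrative in-memory model, not quint-code's storage layer; the class and method names are hypothetical:

```python
from dataclasses import dataclass, field

@dataclass
class Measurement:
    verdict: str  # "accepted" | "partial" | "failed" | "superseded"

@dataclass
class Decision:
    measurements: list = field(default_factory=list)

    def record_measurement(self, m: Measurement) -> None:
        # Mark any existing active measurement as superseded. It stays in
        # the list (the database, in the real system) for audit, but is
        # excluded from the active chain used for R_eff.
        for old in self.measurements:
            if old.verdict != "superseded":
                old.verdict = "superseded"
        self.measurements.append(m)

    def active_measurements(self) -> list:
        return [m for m in self.measurements if m.verdict != "superseded"]
```

Recording a `partial` and then an `accepted` measurement leaves two rows stored but only the `accepted` one counted, which is exactly why an old partial verdict can no longer drag R_eff down forever.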
Note-decision deduplication
Notes and decisions serve different purposes. Notes are observations ("we use Redis here"). Decisions are contracts ("we chose Redis because X, with invariants Y and rollback Z"). When someone tries to record a note that duplicates an existing decision, quint-code catches it.
The check uses containment (not Jaccard similarity):
- >70% of the note's words appear in a decision title → rejected with an explanation
- 50-70% → warning, note still recorded
- <50% → pass silently
The same check runs note-vs-note to prevent duplicate notes from accumulating.
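The containment score and its thresholds can be sketched as follows. The thresholds come from the list above; the whitespace tokenization and function names are assumptions about an implementation the text doesn't show:

```python
def containment(note: str, decision_title: str) -> float:
    """Fraction of the note's words that appear in the decision title.

    Unlike Jaccard similarity, containment is asymmetric: a short note
    fully covered by a long decision title scores 1.0 even if the title
    has many extra words the note lacks.
    """
    note_words = set(note.lower().split())
    title_words = set(decision_title.lower().split())
    if not note_words:
        return 0.0
    return len(note_words & title_words) / len(note_words)

def check_note(note: str, decision_title: str) -> str:
    score = containment(note, decision_title)
    if score > 0.7:
        return "rejected"   # duplicates an existing decision; explain why
    if score >= 0.5:
        return "warning"    # note still recorded, but flagged
    return "pass"           # silently accepted
```

For example, the note "we use redis for caching" against a decision titled "chose redis for caching over memcached" shares 3 of the note's 5 words (0.6), so it records with a warning rather than being rejected.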
Batch cleanup: /q-refresh action="reconcile" scans all active notes against all
active decisions in one pass and reports overlaps.
From decisions to machine-verified evidence
The integrity mechanisms above ensure decisions are honest at recording time. But the strongest evidence comes from machine verification — tests that prove invariants hold against real code.
Decision records from the full cycle contain enough structure to serve as test specifications: invariants, post-conditions, admissibility constraints, and affected file paths. Any coding agent can translate these into property-based tests and attach the results as CL3 evidence — the highest confidence level in the R_eff model.
This creates a closed loop: decisions define what must hold, tests verify that it holds, evidence records the verification, and R_eff reflects the trust level. When code drifts (files change after baseline), R_eff drops and the tests must be re-run. The integrity system doesn't just check that decisions were made honestly — it checks that reality still matches the decision.
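The drift half of that loop rests on the baseline snapshot mentioned earlier (file hashes taken at decision time). A minimal sketch of snapshotting and drift detection, assuming SHA-256 content hashes; quint-code's real storage format and hash choice are not specified in the text:

```python
import hashlib
from pathlib import Path

def snapshot(paths: list) -> dict:
    """Record a content hash per affected file at decision time (the baseline)."""
    return {p: hashlib.sha256(Path(p).read_bytes()).hexdigest() for p in paths}

def drifted(baseline: dict) -> list:
    """Files whose current contents no longer match the baseline.

    A deleted file counts as drift, same as a modified one.
    """
    return [
        p for p, h in baseline.items()
        if not Path(p).exists()
        or hashlib.sha256(Path(p).read_bytes()).hexdigest() != h
    ]
```

When `drifted(baseline)` comes back non-empty, reality has moved past the decision: R_eff drops and the attached tests have to be re-run before the evidence counts as current again.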