Full Decision Cycle
/q-reason loads a comprehensive prompt with full knowledge of all quint-code tools
and the FPF methodology. The agent then drives the conversation, calling the right tools at
each step. In standard or deep mode, it goes through the full cycle.
But we recommend driving each step manually — at least until the workflow becomes second nature. Each command loads its own focused prompt with specific guidance for that step. The structured thinking process works the same way for you as it does for the AI: defining the problem first forces clarity, setting criteria before seeing options prevents bias, and the verification gate catches weak reasoning.
/q-frame — Define the problem
Most engineering mistakes happen because the problem was never properly stated. "We need caching" is not a problem — it's a solution in search of a problem. The actual problem might be "API response time exceeds 500ms on the product listing page."
/q-frame [describe what's wrong]

The agent creates a ProblemCard with:
- Signal — the anomalous observation (not the assumed cause)
- Constraints — hard limits that any solution must respect
- Optimization targets — what to improve (1-3 max)
- Observation indicators — what to monitor but NOT optimize (Anti-Goodhart)
- Acceptance criteria — how you'll know the problem is solved
- Blast radius — what systems/teams are affected
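The card's shape can be sketched as a simple data structure. This is an illustration only; the field names mirror the list above but are not the tool's actual schema:

```python
from dataclasses import dataclass

@dataclass
class ProblemCard:
    """Illustrative sketch of the fields /q-frame produces (hypothetical schema)."""
    signal: str                        # the anomalous observation, not the assumed cause
    constraints: list[str]             # hard limits any solution must respect
    optimization_targets: list[str]    # what to improve (1-3 max)
    observation_indicators: list[str]  # monitored but never optimized (Anti-Goodhart)
    acceptance_criteria: list[str]     # how you'll know the problem is solved
    blast_radius: list[str]            # affected systems/teams

card = ProblemCard(
    signal="API response time exceeds 500ms on the product listing page",
    constraints=["no schema changes this quarter"],
    optimization_targets=["p95 latency"],
    observation_indicators=["error rate", "cache hit ratio"],
    acceptance_criteria=["p95 latency < 200ms sustained over 7 days"],
    blast_radius=["product-listing service", "frontend team"],
)
assert len(card.optimization_targets) <= 3  # the 1-3 target limit from the frame
```

Note that the signal records the observation ("response time exceeds 500ms"), not the presumed fix ("we need caching").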
/q-char — Define comparison criteria
Before generating options, define how you'll compare them. This prevents bias — you can't cherry-pick criteria to favor the option you already like.
/q-char

Each criterion gets a role:
- constraint — hard limit, must satisfy (e.g., "latency < 100ms")
- target — what you're optimizing (e.g., "throughput")
- observation — watch but don't optimize. This is Anti-Goodhart: if you optimize throughput, you might silently kill reliability. Marking reliability as "observation" means you track it without letting it distort your optimization.
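The three roles can be sketched as an enum plus a criteria table (the names and example values here are illustrative, not the tool's internal representation):

```python
from enum import Enum

class Role(Enum):
    CONSTRAINT = "constraint"    # hard limit, must satisfy
    TARGET = "target"            # what you're actively optimizing
    OBSERVATION = "observation"  # tracked but never optimized (Anti-Goodhart)

# Hypothetical criteria set for the caching example
criteria = {
    "latency_ms":  (Role.CONSTRAINT, "< 100"),
    "throughput":  (Role.TARGET, "maximize"),
    "reliability": (Role.OBSERVATION, "watch for regressions"),
}
```

Keeping reliability as an observation means a variant cannot "win" by trading reliability for throughput without that trade being visible.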
/q-explore — Generate variants
The agent generates 2+ genuinely different approaches. "Different" means different in kind, not degree:
- Bad: "Redis vs Memcached vs in-memory cache" — three variations of the same approach
- Good: "Cache layer vs CDN optimization vs query redesign" — three fundamentally different strategies
/q-explore

Each variant gets:
- Description — what this approach does
- Strengths — why it might work
- Weakest link — what will break first (not generic "cons" — the single thing that bounds quality)
- Risks — what could go wrong
- Stepping stone — does this open future possibilities even if not optimal now?
The tool runs a diversity check — if two variants share more than 50% of their words, it warns "do these differ in kind, not degree?"
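A check of this kind can be approximated by comparing the word sets of two variant descriptions. This is a sketch of the idea, not the tool's actual algorithm:

```python
def word_overlap(a: str, b: str) -> float:
    """Fraction of shared words between two descriptions (Jaccard similarity)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

v1 = "redis cache layer in front of the product listing queries"
v2 = "memcached cache layer in front of the product listing queries"

# Only one word differs, so overlap is high: these differ in degree, not kind
if word_overlap(v1, v2) > 0.5:
    print("warning: do these differ in kind, not degree?")
```

Swapping Redis for Memcached trips the warning; swapping a cache layer for a query redesign would not.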
/q-compare — Fair comparison
Apply the criteria from /q-char to the variants from /q-explore.
/q-compare

The result is a Pareto front — the set of variants where no variant is strictly worse than another on all dimensions. This is not "pick the best" — it's "here are the non-dominated options, each sacrificing something different."
Parity matters: same inputs, same scope, same budget for all options — or the comparison is junk.
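Non-domination can be sketched in a few lines. The scores below are hypothetical (higher is better on each dimension); the point is the filtering logic:

```python
def dominates(a, b):
    """a dominates b if a is at least as good everywhere and strictly better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(variants):
    """Keep only variants not dominated by any other variant."""
    return {
        name: scores
        for name, scores in variants.items()
        if not any(dominates(other, scores)
                   for other_name, other in variants.items() if other_name != name)
    }

# (speed, simplicity) — hypothetical scores, higher is better
variants = {
    "cache layer":      (9, 4),
    "CDN optimization": (7, 8),
    "query redesign":   (6, 9),
    "do nothing":       (1, 3),  # worse than every other option on both axes
}
front = pareto_front(variants)  # "do nothing" drops out; the rest each trade something
```

The three surviving options each sacrifice something different, which is exactly what the front is supposed to surface.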
/q-decide — Record the contract
The final step: select a variant and record the decision as a contract.
/q-decide

Before recording, the agent runs a verification gate:
- Strongest counter-argument against the selected variant
- What would make this decision wrong in 3 months?
- Is any evidence self-referential (agent's own reasoning as proof)?
If the counter-argument survives, the decision is recorded with invariants, rollback plan, refresh triggers, and an expiry date.
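The resulting contract can be sketched as a data structure. Field names are illustrative, not the tool's actual record format:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DecisionContract:
    """Illustrative shape of a recorded decision (hypothetical schema)."""
    selected_variant: str
    invariants: list[str]        # properties that must always hold
    rollback_plan: str           # how to undo if it goes wrong
    refresh_triggers: list[str]  # events that force re-evaluation
    expires: date                # stale decisions must be revisited

    def is_stale(self, today: date) -> bool:
        return today >= self.expires

contract = DecisionContract(
    selected_variant="CDN optimization",
    invariants=["p95 latency < 200ms on product listing"],
    rollback_plan="revert CDN config, restore origin routing",
    refresh_triggers=["traffic doubles", "CDN pricing changes"],
    expires=date(2025, 6, 30),
)
```

The expiry date is what keeps the contract honest: a decision that can never go stale is never re-examined.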
The protocol at a glance
/q-frame → /q-char → /q-explore → /q-compare → /q-decide
- /q-frame — what's broken?
- /q-char — what matters?
- /q-explore — genuinely different options
- /q-compare — fair comparison
- /q-decide — engineering contract
Decisions as test specifications
A decision record produced by the full cycle is not just documentation. It is a specification formal enough to generate tests from — directly, without any additional tooling.
Look at what a standard/deep decision contains after the full cycle:
- Invariants — "warning fires for any decision with affected_files that has no baseline hashes"
- Post-conditions — checklist of what must be true after implementation
- Admissibility — what is NOT acceptable (negative constraints)
- Acceptance criteria — from the problem frame, measurable
- Affected files — exact code locations
- Weakest link — what to stress-test
This is enough for any coding agent to generate property-based tests, integration tests, or module tests — without quint-code commands or special prompts. Just ask your agent:
"Read the decision record in .quint/decisions/dec-XXX.md.
Write property-based tests that verify the invariants and post-conditions
against the affected files."

The agent knows the project language (from the codebase), the invariants (from the decision), and which files to test (from affected_files). Property-based testing frameworks exist for every major language:
- Go — rapid, gopter
- Python — Hypothesis
- Rust — proptest
- TypeScript — fast-check
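As an illustration of what a generated test might look like, here is a Hypothesis sketch for the invariant quoted earlier ("warning fires for any decision with affected_files that has no baseline hashes"). The `check_baselines` function and the decision dict are hypothetical stand-ins for your project's code:

```python
from hypothesis import given, strategies as st

def check_baselines(decision: dict) -> list[str]:
    """Hypothetical function under test: warn when any affected file
    lacks a baseline hash."""
    missing = [f for f in decision["affected_files"]
               if f not in decision["baseline_hashes"]]
    return ["warning: missing baseline hashes"] if missing else []

@given(
    affected=st.lists(st.text(min_size=1), min_size=1, unique=True),
    hashed=st.sets(st.text(min_size=1)),
)
def test_warning_fires_without_baselines(affected, hashed):
    decision = {
        "affected_files": affected,
        "baseline_hashes": {f: "deadbeef" for f in hashed},
    }
    warnings = check_baselines(decision)
    # Invariant: a warning fires iff some affected file has no baseline hash
    assert bool(warnings) == any(f not in hashed for f in affected)

test_warning_fires_without_baselines()  # Hypothesis generates the cases
```

The property restates the invariant from the decision record almost verbatim, which is the point: the record is the specification.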
This is a side effect of structured reasoning, not a separate feature. The more rigorous the decision cycle, the more testable the output. A tactical note ("we use Redis") gives the agent nothing to test. A full-cycle decision with invariants, constraints, and acceptance criteria gives it a complete test specification.
The test results can then be attached as evidence
(quint_decision(action="evidence")), raising R_eff with machine-verified CL3 evidence
— the strongest kind in the trust model.
Next
- Notes & micro-decisions — the lightweight path
- Decision lifecycle — what happens after you decide
- All commands — complete reference