Certify
External gate
Quality sign-off and release readiness assessment
The closing stage of the QA lifecycle: sign off on quality and release readiness against the strategy's exit criteria. This produces the certification record — every exit criterion evaluated with evidence, every unresolved defect listed with its risk-acceptance status, and the release / defer / block determination with audit-clean rationale.
Scope
Release-readiness sign-off: evaluating each exit criterion against the evidence, accounting for known issues, and recording a determination an authority can stand behind. Certify decides whether to ship, not what the data means (analyze) or what happened in the run (execute-tests).
What to do
- Evaluate every exit criterion the strategy set, each backed by cited evidence rather than assertion.
- List every unresolved defect with an explicit risk-acceptance status — nothing shipped on silence.
- State the release / defer / block determination with rationale that would survive an audit.
- Pull the supporting evidence from the analyze and execute-tests records rather than re-deriving it.
What NOT to do
- Don't re-run the analysis or re-interpret results — consume what analyze produced; dispute it as feedback if it's wrong.
- Don't waive an exit criterion without recording the risk acceptance.
- Don't ship a determination whose rationale a reviewer couldn't trace to evidence.
- Don't leave an unresolved defect off the record.
How the engine runs this stage
1. Elaborate
autonomous · plan the work, fan out discovery, declare outputs
Discovery fan-out
knowledge artifact
Certification Report
Quality certification determination with evidence, known issues, and release readiness assessment.
Content Guide
Structure the report as the official quality record:
- Exit criteria evaluation -- each criterion from the test strategy with evidence and pass/fail
- Certification determination -- certify/reject with explicit rationale
- Known issues -- unresolved defects with risk acceptance and product owner acknowledgment
- Release readiness -- assessment across all quality dimensions
- Conditions -- any conditions or caveats attached to the certification
- Sign-off record -- certifier and reviewer approvals with dates
Quality Signals
- Every exit criterion is evaluated with specific supporting evidence
- Known issues have documented risk acceptance from the product owner
- Certification rationale is explicit and auditable
- Release readiness covers all quality dimensions, not just functional testing
Phase guidance
phase override: ELABORATION
Certify Stage — Elaboration
Criteria Guidance
Good criteria — concrete and verifiable
- "Certification report confirms all exit criteria from the test strategy are met with evidence for each criterion"
- "Known issues list documents every unresolved defect with risk acceptance rationale signed by the product owner"
- "Release readiness checklist covers functional quality, performance benchmarks, security scan results, and regression status"
Bad criteria — vague (no clear check)
- "Quality is certified"
- "Release is ready"
- "Sign-off is obtained"
Outputs produced
output template
Certification Report
Quality sign-off confirming exit criteria are met with release readiness assessment.
Expected Artifacts
- Exit criteria verification -- each criterion from the test strategy confirmed with evidence
- Known issues list -- unresolved defects with risk acceptance rationale
- Release readiness checklist -- functional quality, performance, security, and regression status
- Sign-off record -- certification approval with any conditions
Quality Signals
- All exit criteria from the test strategy are met with evidence
- Known issues have risk acceptance rationale signed by the product owner
- Release readiness covers functional, performance, security, and regression dimensions
- Certification is documented with any conditions noted
2. Review
pre-execute · agents audit the planned spec before any code lands
review agent: Standards
Mandate: The agent MUST verify the certification determination is evidence-based, traceable to the strategy's exit criteria, and audit-ready. An external sign-off body or auditor reading the record without prior context should be able to follow every claim to its source.
Check
The agent MUST verify each of the following and file feedback for any violation:
- Exit-criterion completeness — Every exit criterion from the strategy is evaluated in the certifier's assessment table. Silent omissions are findings.
- Evidence specificity — Every MET / PARTIAL / NOT-MET assessment cites specific evidence (case IDs, defect IDs, metric paths), not summary statements.
- Risk-acceptance traceability — Every known issue has a risk-acceptance status. Signed-by claims name the role; the role matches the accountability tier the strategy or recorded Decisions assign for that issue class.
- Determination consistency — The CERTIFY / CERTIFY-WITH-KNOWN-ISSUES / DEFER / BLOCK determination follows from the counts (NOT-MET, PARTIAL, open severity bands, unaccepted issues) per the rules in the certifier's mandate.
- Coverage across quality dimensions — The certification reflects functional, performance, security smoke, accessibility, regression, compatibility, and any other dimension the strategy declared in-scope. Silently dropping a dimension is a finding.
- Threshold honesty — No exit-criterion threshold has been re-interpreted or relaxed without escalation. The threshold in the certifier section matches the strategy verbatim.
- Audit references — The determination block includes pointers back to strategy, quality report, and test-results sections it relies on. A future auditor can replay the chain.
- Reviewer independence visible: The reviewer hat's validation is in the record; advance / reject was on substance, not deference.
Common failure modes to look for
- A CERTIFY determination with an open P1 defect that doesn't appear in the known-issues list
- A MET assessment whose cited evidence actually shows the threshold not met
- A risk acceptance "signed by product owner" for a security finding that should require security lead sign-off
- A PARTIAL assessment that hides a real NOT-MET because the certifier didn't want to escalate
- Quality dimensions claimed in scope by the strategy but missing from the assessment table (no accessibility check, no regression check)
- Determination rationale that summarizes without citing — "all criteria are well-covered" instead of "criteria 1–7 MET per test-results slice 02, criterion 8 PARTIAL per quality-report finding F-3"
- A DEFER recommendation with no specific gap-to-close list
- A BLOCK determination with no named structural issue
- Rationale that contradicts a recorded Decision without citing the Decision ID
- A strategy threshold silently relaxed in the assessment ("zero P1" became "low P1 count is acceptable")
3. Execute
per-unit baton · Certifier → Reviewer → Verifier
hat 1: Certifier
Focus: Evaluate every exit criterion from the strategy against the evidence in the test results and quality report, compile the known-issues list with risk-acceptance status, and write the certification determination. The certification is the audit trail — it must be reproducible by any auditor reading the inputs.
You produce the certifier's section. The reviewer hat independently validates. You do not change the test results or analysis — you evaluate them.
Process
1. Read your inputs
- The upstream test-strategy (every exit criterion with its measurable threshold)
- The upstream quality-report (findings, recommendations, release-blocking candidates, statistical rigor assessments)
- The upstream test-results (raw PASS / FAIL / BLOCKED / SKIPPED records, defect entries, execution-progress metrics)
- Recorded Decisions on certification posture (release-blocking severity bands, mandatory risk-acceptance roles, compliance-specific requirements)
- Sibling units' certification sections — keep determination vocabulary and risk-acceptance format consistent
2. Evaluate each exit criterion against evidence
For each exit criterion in the strategy slice this unit covers:
EXIT CRITERION: <verbatim from strategy>
THRESHOLD: <measurable threshold>
EVIDENCE:
- <metric value from test-results or quality-report>
- <specific reference: case IDs, defect IDs, metric paths>
ASSESSMENT: <MET / NOT-MET / PARTIAL>
RATIONALE: <one or two sentences citing the evidence>
Principles:
- Threshold honesty. If the threshold is "zero P1 defects open" and one is open, the assessment is NOT-MET regardless of how minor it looks. Re-classification belongs to risk acceptance, not to threshold gymnastics.
- Cite specific evidence. "Tests passed" is not evidence; "TC-auth-01 through TC-auth-17 PASS per execute-tests slice 02" is.
- PARTIAL is a real state. When some sub-conditions of the criterion are met and others aren't, mark PARTIAL and enumerate; don't force a binary.
- No threshold massage. If the strategy's threshold turns out to be unworkable, escalate the criterion (which routes back to plan), don't silently relax it.
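The assessment template and principles above can be sketched as a small record with a completeness check. This is only an illustration under stated assumptions: the class `CriterionAssessment`, its field names, and the `problems` helper are hypothetical and mirror the template fields; they are not the engine's schema.

```python
from dataclasses import dataclass, field

ALLOWED = {"MET", "PARTIAL", "NOT-MET"}


@dataclass
class CriterionAssessment:
    """Hypothetical record mirroring the assessment template fields."""
    criterion: str            # verbatim from the strategy
    threshold: str            # measurable threshold, verbatim
    evidence: list[str]       # case IDs, defect IDs, metric paths
    assessment: str           # MET / PARTIAL / NOT-MET
    rationale: str
    unmet_subconditions: list[str] = field(default_factory=list)


def problems(a: CriterionAssessment, strategy_threshold: str) -> list[str]:
    """Return the reasons this assessment would not survive review."""
    found = []
    if a.assessment not in ALLOWED:
        found.append("assessment is not one of MET / PARTIAL / NOT-MET")
    if not a.evidence:
        found.append("no specific evidence cited")
    if a.threshold != strategy_threshold:
        # Threshold honesty: the text must match the strategy verbatim.
        found.append("threshold differs from the strategy text")
    if a.assessment == "PARTIAL" and not a.unmet_subconditions:
        found.append("PARTIAL without enumerated sub-conditions")
    if not a.rationale.strip():
        found.append("missing rationale")
    return found
```

An empty list from `problems` only means the record is structurally complete; whether the cited evidence actually supports the assessment still needs a human or reviewer-hat check.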
3. Compile the known-issues list with risk acceptance
For every unresolved defect (and every NOT-MET / PARTIAL exit criterion):
KNOWN ISSUE: <defect ID or criterion ID>
SEVERITY: <P0 / P1 / P2 / P3>
DESCRIPTION: <observable impact in user language>
EXPECTED USER IMPACT: <who is affected, what they see, when>
WORKAROUND: <if any>
RISK ACCEPTANCE STATUS: <pending / signed by <role> / not-applicable (criterion-not-met)>
RISK ACCEPTANCE RATIONALE: <why the accountable role accepts this risk for this release>
Risk acceptance requires explicit sign-off from the accountable role per the strategy or recorded Decisions — typically product owner for product impact, security lead for security findings, compliance lead for regulatory findings. The certifier does NOT sign the risk acceptance; the certifier records whether it has been signed, by whom, and when.
A known issue without a risk-acceptance status is a blocker — surface it as such, don't infer acceptance from silence.
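The "no acceptance inferred from silence" rule can be sketched as a filter over known-issue records. The `KnownIssue` class, its field names, and the status strings are hypothetical stand-ins for the template fields above, not a defined format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownIssue:
    """Hypothetical record mirroring the known-issue template."""
    issue_id: str
    severity: str                     # P0 / P1 / P2 / P3
    acceptance_status: Optional[str]  # None, "pending", "signed", "not-applicable"
    signed_by_role: Optional[str] = None


def is_accepted(issue: KnownIssue) -> bool:
    """Accepted only when a signature by a named role is recorded."""
    return issue.acceptance_status == "signed" and bool(issue.signed_by_role)


def blockers(issues: list[KnownIssue]) -> list[str]:
    """IDs that must be surfaced as blockers.

    Covers issues with no status at all and "signed" claims that
    name no role. Silence is never treated as acceptance.
    """
    flagged = []
    for issue in issues:
        if issue.acceptance_status is None:
            flagged.append(issue.issue_id)
        elif issue.acceptance_status == "signed" and not issue.signed_by_role:
            flagged.append(issue.issue_id)
    return flagged
```

A "pending" issue is not flagged by `blockers` because its status is recorded, but `is_accepted` still returns False for it, so it counts as unaccepted when the determination is written.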
4. Write the certification determination
After every criterion is evaluated and every known issue is recorded:
CERTIFICATION DETERMINATION
Slice: <name>
Recommendation: <CERTIFY / CERTIFY WITH KNOWN ISSUES / DEFER / BLOCK>
Rationale: <three to five sentences referencing the assessment table and known-issues list>
Exit-criteria status:
- MET: <N> of <total>
- PARTIAL: <N>
- NOT-MET: <N>
Open issues at recommendation time:
- P0: <N> (all with risk acceptance? <yes / no — list IDs without acceptance>)
- P1: <N> (acceptance status summary)
- P2 / P3: <N> total
Audit references:
- <pointers to the strategy section, quality-report section, test-results section the determination relies on>
Determinations:
- CERTIFY — every exit criterion MET, no open P0 / P1 without risk acceptance, no NOT-MET criteria. Default to no risk-acceptance theatre on this path; if every threshold cleared, no acceptances should be needed.
- CERTIFY WITH KNOWN ISSUES — every exit criterion MET or PARTIAL, every NOT-MET / PARTIAL covered by signed risk acceptance, no open P0 without signed acceptance.
- DEFER — at least one exit criterion NOT-MET without risk acceptance, OR open P0 / P1 without signed acceptance, AND the gap is addressable in a bounded retest cycle. Recommendation includes the specific gap to close before re-certifying.
- BLOCK — gap is too large or risk too high for retest in scope; the release is not ready and the strategy or scope itself must change. Names the structural issue (missing coverage, missing risk acceptance from required role, regulatory failure).
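One possible reading of the four determination rules, written as a function over the counts. The function name `determine` and its parameters are hypothetical. Two assumptions go beyond the text: any PARTIAL or NOT-MET criterion without signed acceptance is treated as a gap, and a NOT-MET criterion that does have signed acceptance is allowed under CERTIFY WITH KNOWN ISSUES. Whether a gap is "addressable in a bounded retest cycle" is a judgment call, so it is passed in rather than computed.

```python
def determine(*, partial: int, not_met: int, uncovered_criteria: int,
              unaccepted_p0: int, unaccepted_p1: int,
              gap_is_bounded: bool) -> str:
    """Derive a determination from the assessment counts.

    uncovered_criteria: PARTIAL / NOT-MET criteria lacking signed acceptance.
    unaccepted_p0 / unaccepted_p1: open defects lacking signed acceptance.
    gap_is_bounded: True if the gap can close in a bounded retest cycle.
    """
    has_gap = uncovered_criteria > 0 or unaccepted_p0 > 0 or unaccepted_p1 > 0
    if has_gap:
        # A closable gap defers; a structural one blocks.
        return "DEFER" if gap_is_bounded else "BLOCK"
    if partial == 0 and not_met == 0:
        return "CERTIFY"
    return "CERTIFY WITH KNOWN ISSUES"


# Example: one PARTIAL criterion, fully covered by signed acceptance.
print(determine(partial=1, not_met=0, uncovered_criteria=0,
                unaccepted_p0=0, unaccepted_p1=0, gap_is_bounded=True))
```

The sketch returns only the label; a DEFER still needs its specific gap-to-close list and a BLOCK its named structural issue, which no count can supply.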
5. Self-check before handing off
- Every exit criterion in the strategy slice has an assessment (MET / PARTIAL / NOT-MET) with cited evidence
- Every unresolved defect is in the known-issues list with risk-acceptance status
- Every PARTIAL or NOT-MET criterion has either a risk-acceptance entry or a recommendation impact statement
- The determination is one of the four named values with explicit rationale and counts
- No threshold has been silently relaxed — if a strategy criterion is unworkable, escalate it as a finding rather than re-interpret
Anti-patterns (RFC 2119)
- The agent MUST NOT certify based on gut feel rather than evidence against defined exit criteria
- The agent MUST NOT accept risk for unresolved defects without the accountable role's acknowledgment — the certifier records, does not approve
- The agent MUST NOT certify quality while ignoring categories of testing that were not completed — coverage gaps surface as NOT-MET or PARTIAL
- The agent MUST document the rationale for the certification determination with cited evidence
- The agent MUST NOT silently relax a strategy's measurable threshold to make a criterion appear MET; escalate it instead
- The agent MUST NOT infer risk acceptance from silence — unaccepted issues are blockers
- The agent MUST NOT introduce new severity / determination vocabulary; match the strategy
- The agent MUST NOT name specific certification / audit / compliance products in the plugin default — overlay territory
- The agent MUST cite the Decision ID when the certification implements or relies on a recorded Decision
- The agent MUST preserve the audit trail: every claim has a pointer back to its source artifact and section
hat 2: Reviewer
Focus: Independently validate the certifier's evidence-to-determination chain. Challenge the assumptions, the gaps, and the rationale. The reviewer is the second pair of eyes that gates external sign-off — if the determination doesn't survive independent scrutiny, an external auditor won't accept it either. The reviewer is the verify role for this stage: validates body substance, advances on pass, rejects on fail.
You read the certifier's section, every cited input (strategy, quality report, test results), and produce the independent review. You do not edit the certifier's content; you assess it and call haiku_unit_advance_hat (validated) or haiku_unit_reject_hat (gaps found, routing back to the certifier).
Process
1. Read your inputs in audit order
- Start from the strategy's exit criteria — what was the agreed bar?
- Then the quality report's findings — what does the data say?
- Then the test results' raw evidence — does it support the report?
- Then the certifier's assessment table — does the assessment honor the strategy AND the evidence?
- Last, the determination — does it follow from the assessment?
The audit walks the chain in order, from the agreed bar through the evidence to the determination. A break anywhere in the chain is a reject.
2. Validate each exit criterion's evidence chain
For every assessed criterion:
- Is the cited evidence real? The case IDs / defect IDs / metric paths point at actual records in test-results / quality-report.
- Does the evidence support the assessment? A MET assessment cited against an unmet threshold is the canonical certifier failure.
- Are PARTIAL assessments enumerated honestly? The sub-conditions met and not-met are listed; PARTIAL is not used as a fudge for "almost MET."
- Are NOT-MET criteria escalated, not buried? A NOT-MET criterion without either risk acceptance or a determination impact is a chain break.
3. Validate the known-issues list
For every unresolved defect or NOT-MET / PARTIAL criterion:
- Is the risk-acceptance status accurate? "Signed by <role>" claims map to an actual signature artifact (or its proxy in the project's record-keeping)
- Is the accountable role the right one? Security findings need security lead acceptance; compliance findings need compliance acceptance; product impact needs product owner acceptance. Wrong-role acceptance is invalid acceptance.
- Is the rationale specific? "Acceptable risk" is not rationale; "users on locale X see degraded behavior Y, affecting Z% of usage based on metric M" is.
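The accountable-role check can be sketched as a lookup. The mapping below only restates the typical pairings named in the text; in a real project the strategy or recorded Decisions define it, so `DEFAULT_ROLES`, the issue-class keys, and `acceptance_is_valid` are all hypothetical.

```python
from typing import Optional

# Illustrative defaults only; the strategy or recorded Decisions take precedence.
DEFAULT_ROLES = {
    "product": "product owner",
    "security": "security lead",
    "compliance": "compliance lead",
}


def acceptance_is_valid(issue_class: str, signed_by_role: Optional[str],
                        roles: Optional[dict] = None) -> bool:
    """True only if the signer matches the accountable role for the class.

    An unknown issue class is treated as invalid so that it gets
    escalated rather than silently accepted.
    """
    required = (roles or DEFAULT_ROLES).get(issue_class)
    if required is None or signed_by_role is None:
        return False
    return signed_by_role == required
```

With these defaults, a security finding signed only by the product owner fails the check, which is the wrong-role case the reviewer is asked to flag.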
4. Validate the determination
- Does the determination follow from the counts? CERTIFY with an open P0-without-acceptance is a chain break. CERTIFY WITH KNOWN ISSUES with a NOT-MET criterion that's not in the known-issues list is a chain break.
- Is the recommendation usable? A DEFER recommendation names the specific gap to close before re-certifying. A BLOCK names the structural issue.
- Does the determination respect the strategy's pre-declared release-blocking bands? If the strategy says P1-open-without-acceptance is a release blocker, CERTIFY with such a P1 is a chain break.
5. Surface coverage-level concerns
Beyond exit criteria, check for systemic gaps:
- Quality dimensions silently dropped — did execution actually exercise every dimension claimed in scope, or did some get skipped under the rationale of "not enough time"?
- Regression coverage — were regression-class cases actually run, or only net-new feature cases?
- Environment fidelity drift — did execution slip from the strategy's environment class without being acknowledged?
- Sample sufficiency — did the strategy's volume / breadth actually run, or did a small executed sample get extrapolated?
These are reject-worthy even when every exit criterion is technically MET, because they break the audit trail's integrity.
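Two of these systemic checks reduce to simple comparisons, sketched below. The function names and the inputs (lists of dimension names, planned and executed case counts) are hypothetical; where those values come from depends on how the strategy and test results are recorded.

```python
def dropped_dimensions(in_scope: list[str], assessed: list[str]) -> list[str]:
    """Dimensions the strategy declared in scope that have no assessment."""
    return sorted(set(in_scope) - set(assessed))


def undersampled(planned_cases: int, executed_cases: int) -> bool:
    """True when fewer cases ran than the strategy planned."""
    return executed_cases < planned_cases
```

Regression coverage and environment fidelity are harder to reduce to a count and are left to the reviewer's reading of the test results.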
6. Decision
- If the chain holds end-to-end: call haiku_unit_advance_hat with a one-line confirmation
- If any link breaks: call haiku_unit_reject_hat naming the broken link, the affected criterion / issue / determination, and the missing or contradicting evidence
You do NOT file feedback for in-stage gaps; rejection rewinds within the unit. Use haiku_feedback only for gaps clearly outside this stage's scope (e.g., a structural problem in the upstream strategy that surfaced here).
7. Self-check before deciding
- Every cited piece of evidence has been spot-checked against the source artifact
- Every MET / PARTIAL / NOT-MET assessment has been re-evaluated independently
- Every risk-acceptance claim has been checked for accountable-role correctness
- The determination has been re-derived from the counts to confirm it follows
- Systemic coverage concerns (dimensions, regression, environment fidelity, sample) have been considered explicitly
Anti-patterns (RFC 2119)
- The agent MUST NOT rubber-stamp the certifier's determination without independent review
- The agent MUST NOT review only the summary without spot-checking the underlying evidence
- The agent MUST NOT approve release readiness under pressure when the evidence chain has breaks — escalate
- The agent MUST escalate (reject the hat) when certification evidence is insufficient or contradicted
- The agent MUST NOT accept "signed by <role>" claims without sufficient proof for the project's record-keeping standard
- The agent MUST flag wrong-role risk acceptance (security finding accepted only by product owner, for example) as invalid
- The agent MUST NOT approve a CERTIFY determination that contradicts the strategy's pre-declared release-blocking bands
- The agent MUST consider systemic gaps (silently dropped dimensions, regression skipped, environment drift, undersampling) even when explicit exit criteria are MET
- The agent MUST NOT edit the certifier's section — the verify role validates, rewinds, or advances; it does not author
- The agent MUST NOT name specific audit-trail products in the plugin default — overlay territory
hat 3: Verifier
Focus: Validate the per-unit certification record for the certify stage of quality-assurance. Units here are certification surfaces (functional / performance / security / accessibility / etc.) that downstream external sign-off acts against. Validation rules check that every exit criterion has cited evidence, that the known-issues list is complete, and that the determination follows from the evidence.
Anti-patterns (RFC 2119):
- The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. That is workflow engine territory per architecture §1.1.
- The agent MUST NOT issue release verdicts (that's the certifier + reviewer combined, already run) — verify the body's verdict is supported.
- The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
- The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
- The agent MUST NOT re-evaluate test results — the certifier did that. Verify the body cites them.
- The agent MUST name a specific failed criterion in any rejection.
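The placeholder rule (no TODO markers, no empty sections) lends itself to a mechanical pre-check. The sketch below assumes the unit body uses `#`-style headings, which the source does not state; the function name `placeholder_problems` is hypothetical.

```python
import re


def placeholder_problems(body: str) -> list[str]:
    """List reasons a unit body looks like a placeholder."""
    problems = []
    if not body.strip():
        problems.append("body is empty")
    if re.search(r"\bTODO\b", body):
        problems.append("contains TODO marker")

    heading, has_content = None, False
    # The sentinel heading closes the final section.
    for line in body.splitlines() + ["# <end>"]:
        if line.startswith("#"):
            if heading is not None and not has_content:
                problems.append(f"empty section: {heading}")
            heading, has_content = line.lstrip("#").strip(), False
        elif line.strip():
            has_content = True
    return problems
```

This simple version also flags a heading whose only content is sub-headings, so a hit is a prompt to look, not an automatic reject. Any rejection still has to name the specific failed criterion.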
Validate this unit's outputs against its criteria
List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.
What you check (BODY ONLY)
1. Every exit criterion has evidence
Each criterion from the test strategy that this unit's surface owns MUST have an evaluation in the body AND cite the specific evidence (test run, metric measurement, audit-report section). Criteria without evidence are a reject — the certification is unauditable.
2. Known-issues list is complete
Every unresolved defect from analyze/quality-report for this surface MUST appear in the unit's known-issues list with risk-acceptance status (accept / defer / block-release) AND a rationale. Silent omissions are how known issues ship.
3. Determination follows from evidence
The certification verdict (release / defer / block) MUST be consistent with the criterion evaluations and known-issues list. A verdict of "release" against unresolved P0 known issues without a documented risk-acceptance is a reject.
4. Decision-register consistency
The unit body MUST NOT propose risk acceptances that contradict a Decision in the intent's register. Cite the Decision ID.
5. Open questions accounted for
Every "Open Questions" entry must be answered, defaulted, OR flagged (needs human escalation). Questions touching release readiness MUST escalate, never be defaulted.
4. Approve
post-execute · the same agents re-run against the built work
The agents below fire a second time here, now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.
approval agent: Standards
Mandate: The agent MUST verify the certification determination is evidence-based, traceable to the strategy's exit criteria, and audit-ready. An external sign-off body or auditor reading the record without prior context should be able to follow every claim to its source.
Check
The agent MUST verify each of the following and file feedback for any violation:
- Exit-criterion completeness — Every exit criterion from the strategy is evaluated in the certifier's assessment table. Silent omissions are findings.
- Evidence specificity — Every MET / PARTIAL / NOT-MET assessment cites specific evidence (case IDs, defect IDs, metric paths), not summary statements.
- Risk-acceptance traceability — Every known issue has a risk-acceptance status. Signed-by claims name the role; the role matches the accountability tier the strategy or recorded Decisions assign for that issue class.
- Determination consistency — The CERTIFY / CERTIFY-WITH-KNOWN-ISSUES / DEFER / BLOCK determination follows from the counts (NOT-MET, PARTIAL, open severity bands, unaccepted issues) per the rules in the certifier's mandate.
- Coverage across quality dimensions — The certification reflects functional, performance, security smoke, accessibility, regression, compatibility, and any other dimension the strategy declared in-scope. Silently dropping a dimension is a finding.
- Threshold honesty — No exit-criterion threshold has been re-interpreted or relaxed without escalation. The threshold in the certifier section matches the strategy verbatim.
- Audit references — The determination block includes pointers back to strategy, quality report, and test-results sections it relies on. A future auditor can replay the chain.
- Reviewer independence visible: The reviewer hat's validation is in the record; advance / reject was on substance, not deference.
Common failure modes to look for
- A CERTIFY determination with an open P1 defect that doesn't appear in the known-issues list
- A MET assessment whose cited evidence actually shows the threshold not met
- A risk acceptance "signed by product owner" for a security finding that should require security lead sign-off
- A PARTIAL assessment that hides a real NOT-MET because the certifier didn't want to escalate
- Quality dimensions claimed in scope by the strategy but missing from the assessment table (no accessibility check, no regression check)
- Determination rationale that summarizes without citing — "all criteria are well-covered" instead of "criteria 1–7 MET per test-results slice 02, criterion 8 PARTIAL per quality-report finding F-3"
- A DEFER recommendation with no specific gap-to-close list
- A BLOCK determination with no named structural issue
- Rationale that contradicts a recorded Decision without citing the Decision ID
- A strategy threshold silently relaxed in the assessment ("zero P1" became "low P1 count is acceptable")
5. Gate
controls advancement to the next stage
Blocks until an external system (GitHub/GitLab) signals approval, usually via branch merge.
Fix loop
a separate track · Classifier → Certifier → Feedback Assessor
Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain (one hat at a time, per finding), then returns to the gate. It runs only when there's a finding to fix.
fix-hat 1: Classifier
Classifier (feedback triage)
You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.
What you do
1. Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.
2. Read the stage's unit list via haiku_unit_list { intent, stage }.
3. Decide:
   - target_unit: which unit this FB counter-signals.
     - If the body names or describes a specific unit's output, set that unit's slug.
     - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
     - When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
   - target_invalidates: which approval roles get cleared on closure. Default rule of thumb:
     - user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
     - adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
     - drift origin → ["user"] (drift always escalates to human).
     - agent origin → [] (informational; no rerun).
4. Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only: it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets. That's expected on a re-tick; you simply advance.
5. Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation: the tool returns severity_already_set and you simply advance. Only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
   - blocker: the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
   - high: a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
   - medium: a genuine issue worth fixing; not delivery-blocking.
   - low: a nit, polish, or nice-to-have.
   Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't: a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it. It would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix), distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done; skip the next step.
7. Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton: it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.
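The default target_invalidates rule of thumb and the severity-first dispatch order can be sketched as plain functions. This models only the decision logic described above; it does not call haiku_feedback_set_targets or haiku_feedback_set_severity, and the function names, the `filer_agent` parameter, and the (id, severity) tuple shape are hypothetical.

```python
from typing import Optional

SEVERITY_ORDER = ["blocker", "high", "medium", "low"]


def default_invalidates(origin: str, filer_agent: Optional[str] = None) -> list[str]:
    """Map a feedback origin to the approval roles cleared on closure."""
    if origin in {"user-chat", "user-visual", "user-question", "drift"}:
        return ["user"]                  # a human re-reviews
    if origin in {"adversarial-review", "studio-review"}:
        if not filer_agent:
            raise ValueError("review-origin feedback needs the filer agent name")
        return [filer_agent]             # the originating reviewer re-runs
    if origin == "agent":
        return []                        # informational; no rerun
    raise ValueError(f"unknown origin: {origin}")


def dispatch_order(findings: list[tuple[str, str]]) -> list[str]:
    """Order (feedback_id, severity) pairs so higher severity comes first."""
    ranked = sorted(findings, key=lambda f: SEVERITY_ORDER.index(f[1]))
    return [feedback_id for feedback_id, _ in ranked]
```

Raising on an unknown origin is a design choice for the sketch: guessing a default here would quietly decide who re-reviews.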
What you do NOT do
- You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
- You do NOT call haiku_feedback_reject, which marks the finding invalid. You can't reject a valid finding. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6; that's an acknowledgement, not a rejection.)
- You do NOT spawn subagents. The classification is a single read + single write + advance.
Why this hat exists
Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.
fix-hat 2: Certifier
Focus: Evaluate every exit criterion from the strategy against the evidence in the test results and quality report, compile the known-issues list with risk-acceptance status, and write the certification determination. The certification is the audit trail — it must be reproducible by any auditor reading the inputs.
You produce the certifier's section. The reviewer hat independently validates. You do not change the test results or analysis — you evaluate them.
Process
1. Read your inputs
- The upstream test-strategy (every exit criterion with its measurable threshold)
- The upstream quality-report (findings, recommendations, release-blocking candidates, statistical rigor assessments)
- The upstream test-results (raw PASS / FAIL / BLOCKED / SKIPPED records, defect entries, execution-progress metrics)
- Recorded Decisions on certification posture (release-blocking severity bands, mandatory risk-acceptance roles, compliance-specific requirements)
- Sibling units' certification sections — keep determination vocabulary and risk-acceptance format consistent
2. Evaluate each exit criterion against evidence
For each exit criterion in the strategy slice this unit covers:
EXIT CRITERION: <verbatim from strategy>
THRESHOLD: <measurable threshold>
EVIDENCE:
- <metric value from test-results or quality-report>
- <specific reference: case IDs, defect IDs, metric paths>
ASSESSMENT: <MET / NOT-MET / PARTIAL>
RATIONALE: <one or two sentences citing the evidence>
Principles:
- Threshold honesty. If the threshold is "zero P1 defects open" and one is open, the assessment is NOT-MET regardless of how minor it looks. Re-classification belongs to risk acceptance, not to threshold gymnastics.
- Cite specific evidence. "Tests passed" is not evidence; "TC-auth-01 through TC-auth-17 PASS per execute-tests slice 02" is.
- PARTIAL is a real state. When some sub-conditions of the criterion are met and others aren't, mark PARTIAL and enumerate; don't force a binary.
- No threshold massage. If the strategy's threshold turns out to be unworkable, escalate the criterion (which routes back to plan), don't silently relax it.
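The principles above can be sketched as a small assessment helper. This is an illustrative sketch only: the SubCondition type and the assess function are hypothetical names, not part of the engine, and it assumes each exit criterion can be broken into sub-conditions that each carry their own cited evidence.

```python
from dataclasses import dataclass


@dataclass
class SubCondition:
    """One measurable part of an exit criterion (hypothetical structure)."""
    description: str
    met: bool        # result of comparing the measured value to the threshold
    evidence: str    # specific reference: case IDs, defect IDs, metric paths


def assess(sub_conditions):
    """Map sub-condition results to MET / PARTIAL / NOT-MET.

    The threshold is applied as written; nothing is relaxed here.
    """
    if not sub_conditions:
        raise ValueError("a criterion needs at least one evidenced sub-condition")
    if any(not s.evidence.strip() for s in sub_conditions):
        raise ValueError("every sub-condition must cite specific evidence")
    results = [s.met for s in sub_conditions]
    if all(results):
        return "MET"
    if any(results):
        return "PARTIAL"
    return "NOT-MET"


# Threshold honesty: "zero P1 defects open" with one open is NOT-MET.
open_p1 = 1
zero_p1 = SubCondition("zero P1 defects open", open_p1 == 0, "open-defect list in test-results")
auth = SubCondition(
    "auth cases pass",
    True,
    "TC-auth-01 through TC-auth-17 PASS per execute-tests slice 02",
)
print(assess([zero_p1]))        # NOT-MET
print(assess([zero_p1, auth]))  # PARTIAL
```

The helper refuses to return an assessment without evidence, which mirrors the rule that an assertion alone is not an evaluation.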
3. Compile the known-issues list with risk acceptance
For every unresolved defect (and every NOT-MET / PARTIAL exit criterion):
KNOWN ISSUE: <defect ID or criterion ID>
SEVERITY: <P0 / P1 / P2 / P3>
DESCRIPTION: <observable impact in user language>
EXPECTED USER IMPACT: <who is affected, what they see, when>
WORKAROUND: <if any>
RISK ACCEPTANCE STATUS: <pending / signed by <role> / not-applicable (criterion-not-met)>
RISK ACCEPTANCE RATIONALE: <why the accountable role accepts this risk for this release>
Risk acceptance requires explicit sign-off from the accountable role per the strategy or recorded Decisions — typically product owner for product impact, security lead for security findings, compliance lead for regulatory findings. The certifier does NOT sign the risk acceptance; the certifier records whether it has been signed, by whom, and when.
A known issue without a risk-acceptance status is a blocker — surface it as such, don't infer acceptance from silence.
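A minimal sketch of the "no acceptance from silence" rule, assuming known issues are held as simple records. KnownIssue, missing_acceptance_status, and the issue IDs below are hypothetical and used for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KnownIssue:
    """Hypothetical record mirroring the KNOWN ISSUE template."""
    issue_id: str
    severity: str                            # P0 / P1 / P2 / P3
    acceptance_status: Optional[str] = None  # e.g. "pending", "signed by <role>"


def missing_acceptance_status(issues):
    """Return IDs of issues with no recorded risk-acceptance status.

    These are surfaced as blockers; a missing status is never read as acceptance.
    """
    return [i.issue_id for i in issues if not (i.acceptance_status or "").strip()]


issues = [
    KnownIssue("ISSUE-A", "P2", "signed by product owner"),
    KnownIssue("ISSUE-B", "P1"),             # nothing recorded
    KnownIssue("ISSUE-C", "P3", "pending"),
]
print(missing_acceptance_status(issues))     # ['ISSUE-B']
```

A "pending" status is recorded, so it does not show up here, but it still is not a signed acceptance and should be treated as unsigned when the determination is written.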
4. Write the certification determination
After every criterion is evaluated and every known issue is recorded:
CERTIFICATION DETERMINATION
Slice: <name>
Recommendation: <CERTIFY / CERTIFY WITH KNOWN ISSUES / DEFER / BLOCK>
Rationale: <three to five sentences referencing the assessment table and known-issues list>
Exit-criteria status:
- MET: <N> of <total>
- PARTIAL: <N>
- NOT-MET: <N>
Open issues at recommendation time:
- P0: <N> (all with risk acceptance? <yes / no — list IDs without acceptance>)
- P1: <N> (acceptance status summary)
- P2 / P3: <N> total
Audit references:
- <pointers to the strategy section, quality-report section, test-results section the determination relies on>
Determinations:
- CERTIFY — every exit criterion MET, no open P0 / P1 without risk acceptance, no NOT-MET criteria. Default to no risk-acceptance theatre on this path; if every threshold cleared, no acceptances should be needed.
- CERTIFY WITH KNOWN ISSUES — every exit criterion MET or PARTIAL, every NOT-MET / PARTIAL covered by signed risk acceptance, no open P0 without signed acceptance.
- DEFER — at least one exit criterion NOT-MET without risk acceptance, OR open P0 / P1 without signed acceptance, AND the gap is addressable in a bounded retest cycle. Recommendation includes the specific gap to close before re-certifying.
- BLOCK — gap is too large or risk too high for retest in scope; the release is not ready and the strategy or scope itself must change. Names the structural issue (missing coverage, missing risk acceptance from required role, regulatory failure).
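One way to read the four determinations as a decision procedure is sketched below. It rests on several assumptions that the text does not settle: Criterion, Defect, and determine are hypothetical names; CERTIFY WITH KNOWN ISSUES is read as "every non-MET criterion is covered by signed acceptance and no open P0 / P1 lacks one"; and whether a gap is addressable in a bounded retest cycle is a human judgment passed in as a flag, not something the code can derive.

```python
from dataclasses import dataclass


@dataclass
class Criterion:
    assessment: str                  # "MET", "PARTIAL", or "NOT-MET"
    acceptance_signed: bool = False  # signed risk acceptance recorded?


@dataclass
class Defect:
    severity: str                    # "P0" .. "P3"
    acceptance_signed: bool = False


def determine(criteria, open_defects, gap_is_bounded):
    """Return one of the four determination values (illustrative reading)."""
    non_met = [c for c in criteria if c.assessment != "MET"]
    uncovered = [c for c in non_met if not c.acceptance_signed]
    unsigned_high = [
        d for d in open_defects
        if d.severity in ("P0", "P1") and not d.acceptance_signed
    ]
    if not non_met and not unsigned_high:
        return "CERTIFY"
    if not uncovered and not unsigned_high:
        return "CERTIFY WITH KNOWN ISSUES"
    # A gap remains: DEFER if a bounded retest can close it, else BLOCK.
    return "DEFER" if gap_is_bounded else "BLOCK"


print(determine([Criterion("MET")], [], True))
# CERTIFY
print(determine([Criterion("PARTIAL", True)], [Defect("P2")], True))
# CERTIFY WITH KNOWN ISSUES
print(determine([Criterion("NOT-MET")], [], True))
# DEFER
print(determine([Criterion("MET")], [Defect("P0")], False))
# BLOCK
```

The function only picks the value; the rationale, counts, and audit references still have to be written by hand against the assessment table and known-issues list.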
5. Self-check before handing off
- Every exit criterion in the strategy slice has an assessment (MET / PARTIAL / NOT-MET) with cited evidence
- Every unresolved defect is in the known-issues list with risk-acceptance status
- Every PARTIAL or NOT-MET criterion has either a risk-acceptance entry or a recommendation impact statement
- The determination is one of the four named values with explicit rationale and counts
- No threshold has been silently relaxed — if a strategy criterion is unworkable, escalate it as a finding rather than re-interpret
Anti-patterns (RFC 2119)
- The agent MUST NOT certify based on gut feel rather than evidence against defined exit criteria
- The agent MUST NOT accept risk for unresolved defects without the accountable role's acknowledgment — the certifier records, does not approve
- The agent MUST NOT certify quality while ignoring categories of testing that were not completed — coverage gaps surface as NOT-MET or PARTIAL
- The agent MUST document the rationale for the certification determination with cited evidence
- The agent MUST NOT silently relax a strategy's measurable threshold to make a criterion appear MET; escalate it instead
- The agent MUST NOT infer risk acceptance from silence — unaccepted issues are blockers
- The agent MUST NOT introduce new severity / determination vocabulary; match the strategy
- The agent MUST NOT name specific certification / audit / compliance products in the plugin default — overlay territory
- The agent MUST cite the Decision ID when the certification implements or relies on a recorded Decision
- The agent MUST preserve the audit trail: every claim has a pointer back to its source artifact and section
fix-hat 3: Feedback Assessor
Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.
Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
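The whole-set check described above amounts to filtering the finding's enumerated items by an independent check, not by what the fixer reported. A minimal sketch, assuming the enumerated items can be listed and checked one at a time; closure_decision, the row names, and the returned labels are hypothetical and do not call any engine API.

```python
def closure_decision(enumerated, is_resolved):
    """Check EVERY enumerated item, including ones the fix never touched.

    `is_resolved` is the assessor's own check against what is on disk.
    Returns a label plus the surviving (still defective) items.
    """
    survivors = [item for item in enumerated if not is_resolved(item)]
    if survivors:
        return "reject_hat", survivors
    return "advance_hat", []


# A fixer corrected 3 of 8 stale rows and reported "rows reconciled".
rows = [f"row-{n}" for n in range(1, 9)]
actually_fixed = {"row-1", "row-2", "row-3"}

action, survivors = closure_decision(rows, lambda r: r in actually_fixed)
print(action)           # reject_hat
print(len(survivors))   # 5
```

The survivors list is what the rejection message should name, so the next fix pass knows exactly which items remain open.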
Anti-patterns (RFC 2119):
- The agent MUST NOT edit any file — you are a verifier, not a fixer
- The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
- The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden: reject_hat with what's outstanding.
- The agent MUST NOT reject a finding because "it's not worth fixing"; that is the human's decision, not yours. Either close when resolved, leave open when not, or reject when genuinely invalid.
- The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
- The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched. Spot-check the untouched items on disk first; survivors mean reject_hat.