Incident Response · stage 1 of 5

Triage

Auto gate

Assess severity, identify blast radius, and assign ownership

The first response phase of an incident. An alert fired, a customer reported impact, or an operator noticed something wrong — and this stage converts that noisy signal into a structured incident with named ownership, a declared severity, and a known blast radius. It's the difference between "something might be wrong" and "we're running a SEV-2, the IC is named, comms are out."

Scope

Incident framing: ownership, severity, and blast radius from the raw signal. Triage decides what this incident is, how bad it is, and who runs it — not why it's happening (investigate) or how to stop it (mitigate). It establishes the source of truth the rest of the response works from.

What to do

Declare the incident and assign roles — IC, scribe, comms lead — so ownership is unambiguous from the start.
Classify severity against measured user impact, with a stated justification, not a gut tier.
Confirm the incident is real with ground-truth signals and capture ephemeral diagnostic data before it rotates out.
Scope the blast radius to include downstream dependencies, not just the surface that alerted.

What NOT to do

Don't chase root cause or build a timeline — that's the investigate stage.
Don't apply fixes, rollbacks, or flags to stop impact — that's mitigate.
Don't declare a severity the measured impact doesn't justify.
Don't stall the structured incident waiting for perfect data; triage is time-critical and downstream stages refine it.

How the engine runs this stage

1. Elaborate

collaborative · plan the work, fan out discovery, declare outputs

Discovery fan-out

knowledge artifact

Incident Brief

Initial assessment of the incident capturing severity, blast radius, and ownership. This output feeds all downstream stages as the foundational context for the incident response.

Content Guide

Structure the brief to enable immediate action:

Severity classification — SEV level with justification based on user impact, data integrity, and business criticality
Blast radius — affected services, regions, customer segments, and estimated user count
Timeline so far — when the issue started, when it was detected, and how it was detected
Current user impact — what users are experiencing (errors, latency, data loss, etc.)
Ownership assignments — who is investigating, who is communicating, who is the incident commander
Communication status — who has been notified and through which channels
Initial diagnostic data — error samples, relevant metrics, reproduction steps if known

Quality Signals

Severity rating is justified with specific impact data, not gut feeling
Blast radius is scoped precisely — not "some users" but "users in region X hitting endpoint Y"
Ownership is unambiguous — every role has exactly one person assigned
Initial data is captured before it ages out of log retention
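
One way to picture the content guide is as a record with one field per section. The sketch below is a minimal illustration in Python. The class name, the field names, and the `missing_sections` helper are hypothetical, since the source names the sections but does not define a schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class IncidentBrief:
    """Hypothetical shape for the brief; field names are illustrative only."""
    severity: str                      # e.g. "SEV-2"
    severity_justification: str        # the impact data behind the tier
    affected_services: list[str]
    affected_regions: list[str]
    customer_segments: list[str]
    estimated_users: Optional[int]     # None means "not yet measured"
    started_at: str
    detected_at: str
    detected_by: str
    user_impact: str                   # what users are experiencing
    incident_commander: str
    investigator: str
    comms_lead: str
    notified: dict[str, str] = field(default_factory=dict)   # who -> channel
    diagnostics: list[str] = field(default_factory=list)     # links or paths


def missing_sections(brief: IncidentBrief) -> list[str]:
    """Return the names of fields that are still empty or unset."""
    return [name for name, value in vars(brief).items()
            if value in ("", None, [], {})]
```

A brief with an empty comms lead or no diagnostics shows up in `missing_sections`, which lines up with the quality signal that every role has exactly one person assigned.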

Phase guidance

phase override · ELABORATION

Triage Stage — Elaboration

Criteria Guidance

Good criteria — concrete and verifiable

"Incident brief includes severity level (SEV1-4) with justification based on user impact"
"Blast radius assessment identifies all affected services, regions, and customer segments"
"Communication plan specifies who has been notified and through which channels"

Bad criteria — vague (no clear check)

"Severity is assessed"
"People are notified"
"Incident is triaged"

Outputs produced

output template

Incident Brief

Severity classification, blast radius assessment, and ownership assignment.

Expected Artifacts

Severity classification -- SEV level with justification based on user impact
Blast radius assessment -- all affected services, regions, and customer segments identified
Ownership assignment -- incident commander and responders identified
Initial diagnostics -- incident confirmed reproducible with diagnostic data captured

Quality Signals

Severity level has justification based on measurable user impact
Blast radius identifies all affected systems and customer segments
Initial communication has been sent to stakeholders
Diagnostic data is captured for the investigation stage

2. Review

pre-execute · agents audit the planned spec before any code lands

review agent · Severity Accuracy

Mandate: The agent MUST verify severity classification and blast-radius assessment match the measured user impact and dependency surface, and that the escalation path matches the declared severity tier.

Check

The agent MUST verify, filing feedback for any violation:

Severity matches impact — The declared severity tier (SEV-1 / SEV-2 / SEV-3) is consistent with the measured impact number in the brief. A SEV-1 declaration with sub-1% impact is over-classified; a SEV-3 with significant revenue or regulatory exposure is under-classified. Both are findings.
Blast radius is one hop deep — The blast-radius list includes downstream dependencies that consume the failing surface, not just the surface itself. A failing auth service that doesn't list "all surfaces requiring auth" as at-risk is missing the dependency walk.
Escalation matches severity — The brief names the escalation path (who's paged, who's notified) and that path matches the tier. SEV-1 without exec / leadership notification on the comms plan is a finding; SEV-3 paging every on-call is a finding.
Roles are named — IC, scribe, and comms lead are named individuals or rotation slots, not "TBD." For SEV-1, deputy IC is also named.
Confirmation source is named — The brief states how the signal was independently confirmed (second observability source, customer report, manual reproduction). A brief that just cites the original alert is unconfirmed.
User-facing symptom is in plain language — The symptom describes what the user sees, not what the system reports internally. This drives customer comms downstream.

Common failure modes to look for

Severity declared on alert content alone, without a measured user-impact number
Blast radius that lists only the failing component, missing the consumers that share its failure mode
Comms cadence and channels declared but no comms-lead named — "comms will go out" is not an assignment
Under-classification driven by reluctance to page leadership ("let's call it SEV-2 to avoid waking people up") when impact justifies SEV-1
Over-classification driven by a noisy alert that ended up affecting nobody — under-investigated, declared anyway
Confirmation source missing — the brief reads as if the responding hat just trusted the alert and never went ground-truth
"Users are affected" or "many requests failing" as the impact number — vague numbers don't justify a tier
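
A few of these checks are mechanical enough to sketch in code. The following is a rough illustration, not the review agent's actual implementation. The dictionary keys (`severity`, `impact_pct`, `roles`, `confirmed_by`, `at_risk_surfaces`) are assumed names, and only the sub-1% threshold for SEV-1 comes from the checks above.

```python
def review_findings(brief: dict) -> list[str]:
    """Return a list of findings for a brief; an empty list means no violations."""
    findings = []
    severity = brief.get("severity")
    impact_pct = brief.get("impact_pct")

    # Severity must rest on a measured number.
    if impact_pct is None:
        findings.append("no measured impact number")
    elif severity == "SEV-1" and impact_pct < 1.0:
        findings.append("SEV-1 with sub-1% impact looks over-classified")

    # Roles must be named; SEV-1 also needs a deputy IC.
    required = ["ic", "scribe", "comms_lead"]
    if severity == "SEV-1":
        required.append("deputy_ic")
    roles = brief.get("roles", {})
    for role in required:
        if roles.get(role, "TBD") in ("", "TBD"):
            findings.append(f"role not named: {role}")

    # The signal needs an independent confirmation source.
    if not brief.get("confirmed_by"):
        findings.append("no independent confirmation source")

    # The blast radius must include downstream consumers.
    if not brief.get("at_risk_surfaces"):
        findings.append("blast radius lists no downstream consumers")

    return findings
```

Judgement calls such as revenue or regulatory exposure, or whether the symptom is written in plain language, are left out of this sketch because they do not reduce to a simple comparison.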

3. Execute

per-unit baton · Incident Commander → First Responder → Verifier

hat 1 · First Responder

Focus: Confirm the incident is real, capture ephemeral diagnostic data before it ages out of the observability platform, and convert "an alert fired" into a measured user-impact number that justifies the IC's severity declaration. You are the source of ground truth — dashboards summarize, the IC commands, but you go look at what's actually happening to real users right now.

Process

1. Confirm the signal is real

Before treating the incident as confirmed, verify with a second independent source. If the alert came from synthetic monitoring, also check real-user metrics. If a customer reported it, also check error rates or logs. False-positive alerts during a noisy day are common; the first responder is the filter. State explicitly in the brief whether the signal is confirmed and how it was confirmed.
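
The second-source rule can be expressed as a small check. This is a sketch under assumptions: the function name and the source labels (`synthetic`, `rum`) are made up for illustration, and how each source is actually queried is out of scope.

```python
def confirmation_status(alert_source: str, observations: dict[str, bool]) -> tuple[bool, str]:
    """Decide whether a signal is confirmed by a source other than the one that alerted.

    observations maps a source name to whether that source shows the problem.
    """
    independent = [name for name, seen in observations.items()
                   if seen and name != alert_source]
    if independent:
        return True, f"confirmed via {independent[0]} (original: {alert_source})"
    return False, f"unconfirmed: only {alert_source} shows the problem"
```

The returned string is the kind of statement the brief needs: whether the signal is confirmed, and how.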

2. Snapshot ephemeral data immediately

Observability platforms typically retain high-resolution logs and traces for minutes-to-hours, then downsample or roll them off. Before doing anything else slow (like writing analysis), capture the diagnostic data that will be needed later:

Sample error logs from the affected window (5-20 representative entries, not everything)
Trace exemplars for failing requests
Relevant dashboard screenshots at incident-start, current, and one comparison from a healthy window
Recent deploys, config changes, feature-flag flips, infrastructure events in the affected blast radius

Save references (URLs or paths) into the brief so the investigate stage has them.

3. Measure user impact

The IC declared severity based on initial signal. Confirm or correct that with a measured number:

How many users are affected (error count, failed-session count, support-ticket count)?
What percentage of total traffic on the affected surface?
Is there a financial / regulatory dimension (payment failures, data exposure, SLA breach clock)?
Is the impact contained, growing, or of unknown trajectory?

If your measured number doesn't match the IC's declared severity, flag it explicitly — the IC will re-declare. Don't paper over the mismatch.
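
Turning raw counts into an impact statement is simple arithmetic. The sketch below assumes the counts are already available. The function names are hypothetical, and the trajectory rule (comparing two samples of the failure percentage) is an assumption, not something the source prescribes.

```python
from typing import Optional


def measured_impact(surface: str, failed: int, total: int, users: int, window_min: int) -> str:
    """Format a specific impact statement from raw counts."""
    if total <= 0:
        raise ValueError("total must be positive to compute a percentage")
    pct = 100.0 * failed / total
    return (f"approximately {pct:.0f}% of {surface} sessions "
            f"in the last {window_min} minutes, ~{users} users")


def trajectory(previous_pct: Optional[float], current_pct: float) -> str:
    """Assumed rule: no earlier sample is unknown, a rising rate is growing, else contained."""
    if previous_pct is None:
        return "unknown"
    return "growing" if current_pct > previous_pct else "contained"
```

For example, 408 failed sessions out of 3,400 works out to 12%, which is the level of specificity the IC needs to confirm or re-declare the tier.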

4. Identify the user-facing symptom

Translate technical observations into what the user sees: "checkouts failing with 500 at the payment step," "login loop on the mobile app," "search results returning stale data older than 6 hours." This is what goes into customer comms and what the mitigate stage will measure when verifying that its fix worked.

5. Hand off to the verifier

Your deliverable is the evidence portion of INCIDENT-BRIEF.md — confirmed signal, snapshot references, impact number, user-facing symptom. The verifier checks that the brief is internally consistent (declared severity matches measured impact, blast radius matches the surfaces where impact was observed) before the brief is sealed.

Format guidance

The first-responder's section of INCIDENT-BRIEF.md should include:

Signal confirmation: source of the original alert, second-source verification, confirmed-at timestamp
Snapshots: links / paths to captured logs, traces, dashboards, change-log entries
Measured impact: affected user count or percentage, financial / regulatory dimension, trajectory (contained / growing / unknown)
User-facing symptom: one sentence in plain language

Numbers must be specific. "Many users affected" is a reject; "approximately 12% of checkout sessions in the last 10 minutes, ~340 users" is acceptable.
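
A crude filter for vague impact statements might look like the following. It is only a heuristic sketch: the word list and the function name are assumptions, and a statement that passes still needs a human or reviewing agent to judge whether the number is meaningful.

```python
import re

# Assumed list of quantifiers that signal a vague statement.
VAGUE = re.compile(r"\b(many|some|several|lots of|a lot of)\b", re.IGNORECASE)


def impact_is_specific(text: str) -> bool:
    """Accept only statements that contain a digit and no vague quantifier."""
    return bool(re.search(r"\d", text)) and not VAGUE.search(text)
```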

Anti-patterns (RFC 2119)

The agent MUST NOT treat an alert as confirmed without independent second-source verification
The agent MUST NOT start remediation before the brief has measured impact and snapshot references — the mitigate stage runs after triage for a reason
The agent MUST capture ephemeral diagnostic data (logs, traces, dashboards) into the brief before investigating, because that data ages out
The agent MUST NOT report symptoms in technical-only language ("EOF on read from upstream") — translate to user-facing impact
The agent MUST NOT accept the IC's declared severity if measured impact contradicts it — flag the mismatch
The agent MUST NOT work in isolation; feed findings back to the IC continuously so the IC can adjust scope, comms, and ownership
The agent MUST NOT report "no errors in the logs" as evidence of no problem — absence of error logs from a system that's silently failing is itself an incident signal
The agent MUST state the trajectory of impact (contained / growing / unknown) so the IC knows whether to escalate or hold

hat 2 · Incident Commander

Focus: Take ownership of the incident, declare severity, scope the blast radius, and assign coordination roles. The incident commander (IC) is the single point of authority during the response — every decision flows through them so that two people don't roll back to different revisions, page two different on-call teams, or post conflicting status updates. Your job is not to fix the problem; your job is to make sure the right people are fixing the right problem and that everyone else knows what's happening.

Process

1. Take command explicitly

Announce IC role in the incident channel with one sentence: name, role, and the incident slug. Assign at least two supporting roles up front — a scribe (timeline keeper) and a comms lead (status page / customer comms / exec updates). For SEV-1, also assign a deputy IC in case the response runs long enough to need a handoff.

2. Declare severity with justification

Pick from the team's severity tiers (typical shape: SEV-1 = customer-facing outage or data loss, SEV-2 = degraded service or significant impact to a subset, SEV-3 = internal-only or contained impact). In the brief, state the tier AND the impact number that justified it — affected users, error rate, revenue exposure, regulatory clock starting. Severity without a number is a guess.
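
The typical tier shape described above can be sketched as a decision function. This is illustrative only: real teams define their own tiers and thresholds, and the `Impact` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Impact:
    customer_facing_outage: bool
    data_loss: bool
    degraded_or_subset: bool   # degraded service or significant impact to a subset
    impact_number: str         # the measured figure that justifies the tier


def declare_severity(impact: Impact) -> tuple[str, str]:
    """Return (tier, justification) following the typical SEV-1/2/3 shape."""
    if not impact.impact_number.strip():
        # Severity without a number is a guess.
        raise ValueError("an impact number is required to declare severity")
    if impact.customer_facing_outage or impact.data_loss:
        tier = "SEV-1"
    elif impact.degraded_or_subset:
        tier = "SEV-2"
    else:
        tier = "SEV-3"   # internal-only or contained impact
    return tier, f"{tier}: {impact.impact_number}"
```

Refusing to return a tier when the impact number is empty mirrors the rule that the brief states the tier and the number together.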

3. Scope the blast radius

Name every surface that is or could be affected, not just the one that alerted. Walk the dependency graph one hop out from the failing component: what calls it, what it calls, what shares its infrastructure. If a downstream consumer hasn't reported impact yet but is degraded, that's part of the blast radius.
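
The one-hop walk can be sketched over a simple call graph. The data structures here are assumptions for illustration: `calls` maps each service to the services it calls, and `shared_infra` maps an infrastructure name to the services running on it. Service names in the test are made up.

```python
def one_hop_blast_radius(failing: str,
                         calls: dict[str, set[str]],
                         shared_infra: dict[str, set[str]]) -> list[str]:
    """List surfaces one hop from the failing component.

    Covers what calls it, what it calls, and what shares its infrastructure.
    """
    callers = {svc for svc, deps in calls.items() if failing in deps}
    callees = set(calls.get(failing, set()))
    neighbours: set[str] = set()
    for members in shared_infra.values():
        if failing in members:
            neighbours |= members
    return sorted((callers | callees | neighbours) - {failing})
```

Services two hops away are deliberately excluded here. They enter the list only if a later walk from a newly degraded surface reaches them.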

4. Set the comms cadence

For the declared severity, state the update interval (e.g., every 15 minutes for SEV-1, every 30 for SEV-2) and the channels: internal incident channel, status page, customer comms, exec notification. The comms lead executes; the IC owns that it happens.
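
A cadence table makes the next update time unambiguous. The sketch below uses only the two example intervals given above. The names are hypothetical, and any other tier would need an interval supplied by the team.

```python
from datetime import datetime, timedelta

# Example intervals from the text; other tiers are left for the team to define.
CADENCE_MINUTES = {"SEV-1": 15, "SEV-2": 30}


def next_update_due(severity: str, last_update: datetime) -> datetime:
    """Return when the next status update is due for the declared severity."""
    if severity not in CADENCE_MINUTES:
        raise KeyError(f"no update cadence defined for {severity}")
    return last_update + timedelta(minutes=CADENCE_MINUTES[severity])
```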

5. Hand off to first responder

The IC's deliverable on the unit is the declaration block. The first-responder hat takes that frame and goes ground-truth — confirms the signal, captures ephemeral data, measures real impact. The IC stays in coordination mode while the first responder runs.

Format guidance

The IC's section of INCIDENT-BRIEF.md should include:

Declaration: incident slug, severity tier, declared-at timestamp, IC name, scribe name, comms lead name
Severity justification: one sentence + the impact number
Initial blast-radius hypothesis: list of affected surfaces, list of at-risk surfaces
Comms plan: update cadence, channels, who-notifies-whom

Keep declarations short. The IC writes the frame; the first responder fills in the evidence.

Anti-patterns (RFC 2119)

The agent MUST NOT jump to root cause analysis or remediation — that's investigate and mitigate stage work; IC scope is coordination
The agent MUST NOT declare severity without a measured impact number — "feels like a SEV-2" is not a severity classification
The agent MUST NOT leave ownership ambiguous — every active incident has exactly one IC at any moment, and a named scribe and comms lead
The agent MUST NOT downgrade severity without evidence that impact is genuinely contained (and document the evidence in the brief)
The agent MUST NOT under-classify to avoid process overhead — the cost of a missed SEV-1 dwarfs the cost of a "wasted" page
The agent MUST NOT attempt to fix the issue personally — when the IC starts typing remediation commands, coordination stops
The agent MUST scope blast radius to one hop of dependencies, not just the failing component
The agent MUST state the comms cadence and channels up front so the comms lead doesn't have to ask
The agent MUST hand the brief to the first responder with the IC declaration block complete before the responder runs

hat 3 · Verifier

Focus: Validate the per-unit operational artifact for the triage stage of incident-response. Units here are triage decisions: operational steps with concrete preconditions, actions, and post-condition checks. Validation rules check that preconditions are stated, the action is unambiguous, the post-condition has a verifiable check, and rollback is named where applicable.

Anti-patterns (RFC 2119):

The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose; that is workflow engine territory per architecture §1.1.
The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities.
The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
The agent MUST name a specific failed criterion in any rejection.
The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.

Validate this unit's outputs against its criteria

List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.

What you check (BODY ONLY)

1. Preconditions, action, post-condition all stated

The unit body MUST have three concrete sections: preconditions (what must be true before the action runs), the action itself (one unambiguous procedure), and post-condition checks (how to confirm the action succeeded). Reject if any of the three is missing or vague.

2. Verifiable post-condition

The post-condition section MUST name a check that produces a clear pass/fail signal — a metric to read, a query to run, a screen to inspect with named expected values. "Verify by eye that things look good" is a reject.

3. Rollback / recovery named where applicable

Operational units MUST declare a rollback procedure OR explicitly state "no rollback — forward-fix only" with a rationale. Silent absence of rollback is a reject for any unit whose action is not idempotent.

4. Decision-register consistency

The unit must not propose an operational approach contradicting a recorded Decision (e.g., blue-green deploy when Decision N chose canary). Cite the Decision ID.

5. Open questions accounted for

Every "Open Questions" entry must be answered, defaulted, OR flagged (needs human escalation). Operational open questions left to runtime are how outages happen.
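
The structural parts of these checks can be sketched as a body-only scan. This is an assumption-heavy illustration: it supposes the unit body puts each section name ("Preconditions", "Action", "Post-condition") on its own line, which the source does not specify, and it cannot judge whether a section is vague.

```python
import re

REQUIRED_SECTIONS = ("preconditions", "action", "post-condition")


def verify_unit_body(body: str, idempotent: bool) -> list[str]:
    """Return the failed criteria for a unit body; an empty list means it passes."""
    # Treat each stripped line as a candidate section heading.
    headings = {line.strip().rstrip(":").lower() for line in body.splitlines()}
    failures = [f"missing section: {name}"
                for name in REQUIRED_SECTIONS if name not in headings]
    if re.search(r"\bTODO\b", body):
        failures.append("body contains TODO markers")
    # Any mention of rollback counts, including an explicit "no rollback" statement.
    if not idempotent and "rollback" not in body.lower():
        failures.append("no rollback or forward-fix statement for a non-idempotent action")
    return failures
```

Each failure string names a specific criterion, in keeping with the rule that a rejection must cite the criterion that failed.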

4. Approve

post-execute · the same agents re-run against the built work

The agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.

approval agent · Severity Accuracy

Mandate: The agent MUST verify severity classification and blast-radius assessment match the measured user impact and dependency surface, and that the escalation path matches the declared severity tier.

5. Gate

controls advancement to the next stage

Auto

The harness advances automatically — no human in the loop at this gate.

Fix loop

a separate track · Classifier → Incident Commander → Feedback Assessor

Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.

fix-hat 1 · Classifier

Classifier (feedback triage)

You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback item is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is, and nothing more.

What you do

1. Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.
2. Read the stage's unit list via haiku_unit_list { intent, stage }.
3. Decide:
   - target_unit: which unit this FB counter-signals.
     - If the body names or describes a specific unit's output, set that unit's slug.
     - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
     - When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
   - target_invalidates: which approval roles get cleared on closure. Default rule of thumb:
     - user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
     - adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
     - drift origin → ["user"] (drift always escalates to human).
     - agent origin → [] (informational; no rerun).
4. Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only; it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets. That's expected on a re-tick; you simply advance.
5. Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation, so the tool returns severity_already_set and you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
   - blocker: the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
   - high: a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
   - medium: a genuine issue worth fixing; not delivery-blocking.
   - low: a nit, polish, or nice-to-have.

   Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't: a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it; it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix), distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done; skip the next step.
7. Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton: it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop has started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.
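
The target_invalidates rule of thumb and the severity ranking above reduce to a lookup and a sort. The sketch below is a plain-Python illustration of that logic, not the engine's implementation and not a wrapper around the haiku_* tools. The function names and the shape of a finding (a dict with a `severity` key) are assumptions.

```python
from typing import Optional

# Lower rank dispatches first.
SEVERITY_RANK = {"blocker": 0, "high": 1, "medium": 2, "low": 3}
USER_ORIGINS = {"user-chat", "user-visual", "user-question", "drift"}
REVIEW_ORIGINS = {"adversarial-review", "studio-review"}


def default_invalidates(origin: str, filer_agent: Optional[str] = None) -> list[str]:
    """Apply the default rule of thumb for which approval roles get cleared."""
    if origin in USER_ORIGINS:
        return ["user"]            # the human re-reviews; drift always escalates
    if origin in REVIEW_ORIGINS:
        if not filer_agent:
            raise ValueError("review-origin feedback needs the filing agent's name")
        return [filer_agent]       # the originating reviewer re-runs
    if origin == "agent":
        return []                  # informational; no rerun
    raise ValueError(f"unknown origin: {origin}")


def dispatch_order(findings: list[dict]) -> list[dict]:
    """Order findings so that higher-severity ones are fixed first."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
```

Because Python's sort is stable, findings with the same severity keep their original relative order.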

What you do NOT do

You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
You do NOT call haiku_feedback_reject; that marks the finding invalid, and you can't reject a valid finding. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6; that's an acknowledgement, not a rejection.)
You do NOT spawn subagents. The classification is a single read + single write + advance.

Why this hat exists

Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.

fix-hat 2 · Incident Commander

Focus: Take ownership of the incident, declare severity, scope the blast radius, and assign coordination roles. The incident commander (IC) is the single point of authority during the response; every decision flows through them so that two people don't roll back to different revisions, page two different on-call teams, or post conflicting status updates. Your job is not to fix the problem; your job is to make sure the right people are fixing the right problem and that everyone else knows what's happening.

fix-hat 3 · Feedback Assessor

Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.

Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
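
The whole-set rule amounts to checking every enumerated item, not only the ones the fix touched. A minimal sketch follows, assuming the assessor has some way to check each item independently. Here that is the hypothetical `is_resolved` callback, and the returned labels are illustrative rather than actual tool calls.

```python
from typing import Callable, Iterable


def closure_decision(enumerated: Iterable[str],
                     is_resolved: Callable[[str], bool]) -> tuple[str, list[str]]:
    """Check every enumerated item; any survivor means the finding stays open."""
    survivors = [item for item in enumerated if not is_resolved(item)]
    if survivors:
        return "reject_hat", survivors   # name exactly what is still open
    return "advance", []
```

With eight enumerated rows and only three corrected, the decision is a rejection that names the five survivors, which is the outcome the paragraph above asks for.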

Anti-patterns (RFC 2119):

The agent MUST NOT edit any file — you are a verifier, not a fixer
The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden — reject_hat with what's outstanding.
The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean reject_hat