Document
Ask gate · Create evidence packages, audit trails, and compliance documentation
Assemble the auditor-facing artifact. Remediate produced the changes; this stage turns those changes and their evidence into a package an external auditor can navigate — an index over the collected evidence plus the narrative that ties each piece back to a specific control.
Scope
Gathering evidence with provenance and writing the connecting narrative that lets an auditor follow the compliance story without reverse-engineering the implementation. Document decides how the existing evidence is organized and explained — it does not produce new findings (assess) or make remediation changes (remediate).
What to do
- Collect each piece of evidence with full provenance: source, date, collector, and the control it supports.
- Map every piece of evidence to a control and back every narrative claim to specific evidence.
- Organize the package to the structure an auditor expects, so navigation is obvious rather than archaeological.
- Flag missing or weak evidence as a gap to resolve, not something to narrate around.
What NOT to do
- Don't generate new compliance findings or re-grade controls — that's assess.
- Don't make or alter remediation changes — that's remediate.
- Don't write narrative that asserts a control is met without the cited evidence to prove it.
- Don't hand off a package with unmapped evidence or unsourced claims.
How the engine runs this stage
1 · Elaborate
autonomous · plan the work, fan out discovery, declare outputs
Inputs consumed
Discovery fan-out
knowledge artifact
Evidence Package
Complete evidence collection mapped to controls for external audit. This output drives the certify stage's audit preparation.
Content Guide
Organize evidence for auditor consumption:
- Evidence index — master list of all artifacts with control mappings
- Per-control evidence — each control with:
- Control description and requirement
- Evidence artifact(s) with provenance (source, date, collector)
- How the evidence demonstrates compliance
- Audit trail — end-to-end traceability from scope through remediation
- Control narratives — plain-language descriptions of how each control is implemented
- Supporting documentation — policies, procedures, architecture diagrams, and configuration records
Quality Signals
- Every in-scope control has at least one evidence artifact
- All evidence has provenance metadata (source, date, collector)
- The audit trail connects scope, assessment, remediation, and verification without gaps
- Documentation is organized to match the framework's structure for efficient auditor navigation
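The first quality signal lends itself to a mechanical check. A minimal sketch, assuming the evidence index is available as a mapping from control ID to a list of artifact names; the `uncovered_controls` helper and the data shape are hypothetical and not part of the engine:

```python
def uncovered_controls(in_scope: list[str], index: dict[str, list[str]]) -> list[str]:
    """Return in-scope controls that have no evidence artifact in the index."""
    return [control for control in in_scope if not index.get(control)]


# Illustrative data only; control IDs follow the examples used elsewhere on this page.
in_scope = ["CC6.1", "CC1.4", "A.9.2.3"]
index = {
    "CC6.1": ["iam-policy-export.json"],
    "CC1.4": [],  # listed in the index, but nothing collected yet
}
print(uncovered_controls(in_scope, index))  # ['CC1.4', 'A.9.2.3']
```

A control that is absent from the index and a control listed with an empty artifact list are treated the same way here: both fail the "at least one evidence artifact" signal.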
Phase guidance
phase override · ELABORATION
Document Stage — Elaboration
Criteria Guidance
Good criteria — concrete and verifiable
- "Evidence package includes at least one artifact per control demonstrating implementation with timestamps and provenance"
- "Audit trail links every control to its scope definition, assessment finding, remediation action, and verification evidence"
- "Documentation follows the framework's required format and is organized for efficient auditor navigation"
Bad criteria — vague (no clear check)
- "Evidence is collected"
- "Documentation is complete"
- "Audit trail exists"
Outputs produced
output template
Evidence Package
Complete compliance documentation with artifacts mapped to every in-scope control.
Expected Artifacts
- Evidence artifacts -- at least one artifact per control demonstrating implementation with timestamps and provenance
- Audit trail -- continuous chain from scope through assessment, remediation, and verification
- Control-to-evidence index -- cross-reference mapping each control to its supporting evidence
- Documentation package -- organized for external auditor consumption
Quality Signals
- Every in-scope control has at least one evidence artifact with clear provenance
- Audit trail connects scope through assessment, remediation, and verification without gaps
- Documentation is organized with a clear index for efficient auditor navigation
- Evidence has timestamps and source attribution
2 · Review
pre-execute · agents audit the planned spec before any code lands
review agent · Evidence Quality
Mandate: The agent MUST verify that the evidence package presents complete, current, well-provenanced, navigable evidence for every in-scope control, with continuous coverage across the audit period. Weak evidence is how audits stretch into clarification cycles and how findings appear for controls that were actually operating.
Check
The agent MUST verify, filing feedback for any violation:
- Per-control evidence presence: every in-scope control from CONTROL-MAPPING.md has at least one evidence item OR an explicit acknowledged gap with routing to upstream stages. No control is silently un-evidenced.
- Provenance completeness: every evidence item has source, capture date, capturing role / person, and the path / location of the artifact. Missing provenance is not evidence.
- Timestamps present and meaningful — every screenshot, export, log excerpt, and signed attestation includes a timestamp; the timestamp is from within the audit-period coverage window or is otherwise justified.
- Audit-period coverage continuity — for Type II / period-based engagements, evidence spans the full audit period with no unexplained gaps. Discontinuous coverage is itself a finding.
- Format matches framework expectations — the package structure (control-family ordering, document-naming conventions, supporting-artifact tree) matches the conventions the auditor expects for the framework.
- Cross-references resolve — every "see Evidence E-NN" reference in the narrative resolves to an existing evidence-inventory row.
- Narrative claims trace to evidence — every claim in the control narratives cites a specific evidence row, not "as discussed above" or "per the team".
- Shared evidence credited once — evidence supporting multiple controls is described once in the inventory and referenced from each control narrative, not duplicated and drifted.
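The "cross-references resolve" check can be approximated with a short script. This is a sketch under stated assumptions: evidence IDs follow the `E-NN` pattern used in the examples on this page, and the inventory is available as a set of IDs. The `unresolved_refs` helper is hypothetical.

```python
import re

# Assumed reference convention: "E-" followed by digits, as in "see Evidence E-12".
EVIDENCE_REF = re.compile(r"\bE-\d+\b")


def unresolved_refs(narrative: str, inventory_ids: set[str]) -> list[str]:
    """Return evidence IDs cited in the narrative that have no inventory row."""
    cited = set(EVIDENCE_REF.findall(narrative))
    return sorted(cited - inventory_ids, key=lambda ref: int(ref.split("-")[1]))


narrative = (
    "Access reviews run quarterly (see Evidence E-05). "
    "Permission boundaries are enforced per E-08 and E-31."
)
print(unresolved_refs(narrative, {"E-05", "E-08", "E-22"}))  # ['E-31']
```

A non-empty result means at least one narrative claim points at evidence the inventory does not contain, which is worth filing as feedback.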
Common failure modes to look for
- A screenshot without a timestamp or surrounding URL / context that locates it in the system being evidenced
- An evidence inventory entry citing "the team" or "the lead" as source rather than a role + named individual + date
- A control narrative that asserts "monitoring is in place" with no monitoring screenshot, log sample, or alert-history reference
- An audit-period gap (e.g., evidence dated October and February with nothing for November–January) silently passed off as "continuous"
- Verbal attestation treated as primary evidence with no follow-up artifact (signed memo, ticket, dated email confirmation)
- Two different evidence items for the same artifact (an IAM policy listed once for CC6.1 and again for A.9 with no cross-reference) — drift waiting to happen
- A management summary that overstates coverage (says "all controls met" when the inventory clearly shows partials and acknowledged gaps)
- Evidence package organized by submission order rather than by control family, forcing the auditor to reverse-engineer the structure
- Third-party / inherited evidence (bridge letters, sub-processor attestations) included without confirming the inheriting control is actually inherited at the scope-stage level
3 · Execute
per-unit baton · Evidence Collector → Documentation Writer → Verifier
hat 1 · Documentation Writer
Focus: Write the narrative documentation that ties the collected evidence to the controls and tells the compliance story end-to-end. An auditor opens the evidence package and reads from a single index — your job is to make that index navigable, the control descriptions honest, and the audit trail continuous. You produce the narrative sections of the intent-scope EVIDENCE-PACKAGE.md.
You do NOT gather raw evidence — the evidence-collector has already inventoried it. You write the connective tissue.
Process
1. Read your inputs
- The evidence inventory and coverage table the evidence-collector produced
- The intent-scope CONTROL-MAPPING.md, GAP-REPORT.md, and REMEDIATION-LOG.md
- The unit's success criteria
- Any prior audit package the user references (match its structure; auditors prefer continuity)
2. Design the package structure
Different frameworks expect different package shapes. Common high-level structures:
- By control family (SOC 2 Common Criteria, ISO 27001 Annex A clauses) — most readable for the auditor; recommended default
- By system — useful when the engagement is scoped tightly to one system
- By audit-procedure — only if the auditor has provided their procedure list and asks for that ordering
Pick one structure and use it consistently. Mixing structures within a single package is how auditors get lost and start filing clarification requests.
3. Write each control's narrative
For each in-scope control, write a section that answers, in order:
- What the control requires — the requirement text or precise summary
- How the organization implements it — the actual mechanism (the policy + the technical enforcement + the operational practice)
- What evidence supports the implementation — cross-reference the evidence inventory rows by name
- Coverage window and known limitations — when the control has been in effect, plus any pre-effective-date gaps
Keep narratives concrete. "We have strong access controls" is not a narrative; "Production access is brokered by Okta groups, enforced at the application boundary in auth/middleware.ts, and audited monthly through the access-review runbook with quarterly attestations" is a narrative.
4. Cross-reference, never duplicate
Many controls share evidence (an IAM policy export covers CC6.1, A.9.2.3, and §164.312(a)(1) at once). Write the evidence description once in the evidence inventory and reference it from each control narrative. Duplicating the description per control is how the package drifts when the underlying evidence is updated.
Use anchored references inside the document: See [Evidence E-12: IAM Policy Export](#e-12). Loose "see above" references are a maintenance hazard.
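One way to keep shared evidence described once is to generate the references rather than hand-write them. The sketch below assumes evidence IDs like `E-12` and a renderer that resolves lowercase anchors such as `#e-12`; both helper names are hypothetical.

```python
def anchored_ref(evidence_id: str, title: str) -> str:
    """Render an in-document link such as [Evidence E-12: ...](#e-12)."""
    return f"[Evidence {evidence_id}: {title}](#{evidence_id.lower()})"


def controls_by_evidence(control_evidence: dict[str, list[str]]) -> dict[str, list[str]]:
    """Invert control -> evidence IDs into evidence ID -> controls that cite it."""
    inverse: dict[str, list[str]] = {}
    for control, evidence_ids in control_evidence.items():
        for evidence_id in evidence_ids:
            inverse.setdefault(evidence_id, []).append(control)
    return inverse


print(anchored_ref("E-12", "IAM Policy Export"))
# [Evidence E-12: IAM Policy Export](#e-12)

shared = controls_by_evidence({"CC6.1": ["E-12"], "A.9.2.3": ["E-12"], "CC1.4": ["E-22"]})
print(shared["E-12"])  # ['CC6.1', 'A.9.2.3']
```

The inverted index shows at a glance which inventory rows are shared, so each shared row can list the controls that reference it.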
5. Write the audit trail summary
The auditor will want a chronological view of compliance activity over the audit period — when controls were implemented, when policies took effect, when reviews ran, when remediations closed. Write this as a single table at the front of the package:
| Date | Activity | Related control(s) | Evidence ref |
|---|---|---|---|
| 2026-01-15 | Quarterly access review | CC6.1, A.9.2 | E-05 |
| 2026-02-10 | Personnel security policy v1.2 published | CC1.4, A.7.1.1 | E-22 |
| 2026-03-04 | IAM permission-boundary deploy | CC6.1, A.9.2 | E-08 |
Gaps in the timeline are findings. Honest acknowledgement now is cheaper than auditor discovery later.
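If activity records are kept as structured data, the table can be rendered in date order instead of maintained by hand. A minimal sketch; the `Activity` record and `render_audit_trail` function are illustrative assumptions, and the sample rows reuse the example dates above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Activity:
    when: date
    description: str
    controls: tuple[str, ...]
    evidence_ref: str


def render_audit_trail(activities: list[Activity]) -> str:
    """Render activities as a markdown table, oldest first."""
    rows = [
        "| Date | Activity | Related control(s) | Evidence ref |",
        "|---|---|---|---|",
    ]
    for a in sorted(activities, key=lambda a: a.when):
        controls = ", ".join(a.controls)
        rows.append(f"| {a.when.isoformat()} | {a.description} | {controls} | {a.evidence_ref} |")
    return "\n".join(rows)


activities = [
    Activity(date(2026, 3, 4), "IAM permission-boundary deploy", ("CC6.1", "A.9.2"), "E-08"),
    Activity(date(2026, 1, 15), "Quarterly access review", ("CC6.1", "A.9.2"), "E-05"),
]
print(render_audit_trail(activities))
```

Sorting at render time keeps the table chronological even when records are appended out of order.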
6. Write the management summary
The first page is the management summary: scope, frameworks, audit period, count of controls and their status, list of any accepted-risk items. Auditors and management read this first; it sets expectations.
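Deriving the control counts from the inventory, rather than typing them, makes it harder for the summary to overstate coverage. A sketch, assuming each control carries a single status label; the labels `met`, `partial`, and `gap` and the `status_line` helper are illustrative, not defined by this stage.

```python
from collections import Counter


def status_line(control_status: dict[str, str]) -> str:
    """Summarize control statuses as counts, listed alphabetically by status."""
    counts = Counter(control_status.values())
    total = len(control_status)
    parts = [f"{n} {status}" for status, n in sorted(counts.items())]
    return f"{total} in-scope controls: " + ", ".join(parts)


# Status labels here are assumptions made for the example.
statuses = {"CC6.1": "met", "A.9.2.3": "met", "CC1.4": "partial", "CC7.2": "gap"}
print(status_line(statuses))
# 4 in-scope controls: 1 gap, 2 met, 1 partial
```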
7. Hand off
When every in-scope control has a narrative section, the audit trail summary is continuous, and the management summary is honest about what's covered and what's not, hand off to verifier.
Anti-patterns (RFC 2119)
- The agent MUST NOT write narrative that cannot be traced to specific evidence items — every claim cites an evidence row
- The agent MUST NOT create a narrative disconnected from the actual control implementations — invented detail is a finding
- The agent MUST organize documentation to match the auditor's expected structure (control-family ordering is the safe default)
- The agent MUST NOT omit cross-references between related controls and shared evidence — orphan narratives invite duplicate questions
- The agent MUST NOT write documentation so dense the auditor cannot find what they need — navigability is part of audit-readiness
- The agent MUST acknowledge any audit-period gap or coverage limitation explicitly — silent gaps are worse than disclosed ones
- The agent MUST NOT copy boilerplate narrative from a template without grounding every claim in this engagement's evidence
- The agent MUST match the structure and tone of any prior audit package the auditor has worked with; consistency reduces auditor friction
hat 2 · Evidence Collector
Focus: Gather the concrete artifacts that prove each control is implemented, organize them with full provenance, and produce the evidence inventory section of the intent-scope EVIDENCE-PACKAGE.md. You own the "what evidence exists, where it lives, when it was captured, and which control it supports?" surface. You do NOT write the connecting narrative; that's the documentation-writer's baton.
Process
1. Read your inputs
- The intent-scope REMEDIATION-LOG.md (every remediation should have a verify-output that becomes evidence)
- The intent-scope GAP-REPORT.md (every control assessed "met" already has evidence cited)
- The intent-scope CONTROL-MAPPING.md (the full list of controls evidence must cover)
- The unit's success criteria
2. Inventory the evidence types per control
Different control families need different evidence shapes. A starter taxonomy:
- Configuration evidence — exports / screenshots / IaC diffs that show the control's setting (IAM policies, encryption settings, log retention)
- Code evidence — file paths, commit SHAs, and the lines that implement the rule
- Operational evidence — log excerpts, monitoring screenshots, ticket records showing the control fired in practice
- Policy evidence — the policy document itself, with effective date and owner
- Attestation evidence — signed statements from accountable owners (HR, security, engineering leads)
- Third-party evidence — SOC 1/2 bridge letters, vendor questionnaires, sub-processor attestations
Some controls need only one type; most need two or three.
3. Capture every artifact with provenance
For every piece of evidence, record:
- What it is (e.g., "AWS IAM policy export, scoped to production org-unit")
- Where it came from (the system, the export command, the URL)
- When it was captured (exact date; relevance windows matter — SOC 2 Type II needs a continuous coverage period)
- Who captured it (a role or named person — auditors will ask)
- Which control(s) it supports (by framework + id)
- Where it lives now (path inside the evidence package — typically project-overlay-defined)
Screenshots without timestamps are not evidence. Verbal "yes we do that" without a captured artifact is not evidence.
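The six provenance fields above map naturally onto a small record type. The following is a sketch only: the `EvidenceItem` fields, the `missing_provenance` helper, and the sample path are assumptions chosen for illustration, not a schema the engine defines.

```python
from dataclasses import dataclass, fields
from datetime import date
from typing import Optional


@dataclass
class EvidenceItem:
    what: str                    # what the artifact is
    source: str                  # system, export command, or URL it came from
    captured_on: Optional[date]  # exact capture date
    captured_by: str             # role or named person
    controls: tuple[str, ...]    # framework + control IDs it supports
    path: str                    # location inside the evidence package


def missing_provenance(item: EvidenceItem) -> list[str]:
    """Return the names of provenance fields that are empty or unset."""
    return [f.name for f in fields(item) if not getattr(item, f.name)]


item = EvidenceItem(
    what="AWS IAM policy export, scoped to production org-unit",
    source="",  # export command was not recorded
    captured_on=date(2026, 5, 8),
    captured_by="",
    controls=("CC6.1", "A.9.2.3"),
    path="evidence/iam-policy-export.json",  # hypothetical path
)
print(missing_provenance(item))  # ['source', 'captured_by']
```

An item with any missing field would be held back from the inventory until the provenance is filled in.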
4. Map every control to its evidence
Produce the evidence-coverage table inside EVIDENCE-PACKAGE.md:
| Control | Evidence items | Coverage window | Notes |
|---|---|---|---|
| CC6.1 (SOC 2) | iam-policy-export.json (2026-05-08); auth-middleware.ts L40–98; auth-log-sample-Q1.csv | 2026-01-01 to 2026-03-31 | continuous coverage; no audit-period gap |
| A.9.2.3 (ISO 27001) | same as CC6.1 | same | mapped via overlap |
| CC1.4 (SOC 2) | personnel-security.md (v1.2, effective 2026-05-12); HRIS-attestation-Q1.pdf | 2026-01-01 to 2026-03-31 | Q1 attestation captured |
5. Identify coverage gaps
For every control with no evidence OR evidence that doesn't span the audit period:
- Note the gap explicitly (silence is worse than an unmet acknowledgement)
- Identify whether the gap is a collection problem (evidence exists, just wasn't captured) or a control problem (the control isn't actually operating)
- If it's a control problem, file feedback against the upstream stage (remediate or assess) rather than papering it over
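Whether evidence spans the audit period can be checked by sweeping the coverage windows in date order. This is a sketch, assuming each control's evidence is summarized as inclusive `(start, end)` date pairs; `coverage_gaps` is a hypothetical helper and the years in the example are invented.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def coverage_gaps(period_start: date, period_end: date,
                  windows: list[tuple[date, date]]) -> list[tuple[date, date]]:
    """Return inclusive date ranges inside the audit period that no window covers."""
    gaps = []
    cursor = period_start  # earliest day not yet known to be covered
    for start, end in sorted(windows):
        if end < cursor:
            continue  # window lies entirely in already-covered time
        if start > period_end:
            break  # remaining windows begin after the audit period
        if start > cursor:
            gaps.append((cursor, start - ONE_DAY))
        cursor = max(cursor, end + ONE_DAY)
        if cursor > period_end:
            break
    if cursor <= period_end:
        gaps.append((cursor, period_end))
    return gaps


# Evidence for October and February only, against an October-to-February period.
gaps = coverage_gaps(
    date(2025, 10, 1), date(2026, 2, 28),
    [(date(2025, 10, 1), date(2025, 10, 31)), (date(2026, 2, 1), date(2026, 2, 28))],
)
print(gaps)  # [(datetime.date(2025, 11, 1), datetime.date(2026, 1, 31))]
```

The script only finds the gap. Deciding whether it is a collection problem or a control problem still takes a human or agent judgment.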
6. Hand off
When every control in the scope mapping has either a populated evidence row or an acknowledged gap with routing, hand off to documentation-writer.
Anti-patterns (RFC 2119)
- The agent MUST NOT capture evidence without recording when, where, and who — undated evidence is a finding
- The agent MUST NOT accept screenshots without timestamps or surrounding context
- The agent MUST NOT store evidence without mapping it to specific controls — orphan evidence is noise
- The agent MUST verify evidence is current and reflects the actual state — a six-month-old export of a setting that's since drifted is misleading evidence
- The agent MUST NOT silently omit evidence gaps from the coverage table — explicit absence is the audit-honest surface
- The agent MUST NOT convert a stakeholder's verbal claim into "evidence" without an artifact (a signed attestation, a ticket, a written confirmation)
- The agent MUST NOT double-count a single artifact across many controls without recording the overlap explicitly — the auditor will trace each artifact and expect to find one source
- The agent MUST flag any gap between the evidence collection window and the audit period; partial coverage is itself a finding
hat 3 · Verifier
Focus: Validate the per-unit knowledge artifact for the document stage of compliance. Units here are evidence packages: knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. They do NOT cover executable verify-commands or DAG validity (those are workflow engine / build-stage concerns).
Anti-patterns (RFC 2119):
- The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. That is workflow engine territory per architecture §1.1.
- The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check; those are workflow engine responsibilities.
- The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
- The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
- The agent MUST name a specific failed criterion in any rejection.
- The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.
Validate this unit's outputs against its criteria
List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.
What you check (BODY ONLY)
1. Artifact answers its topic
The unit's title and first paragraph define the topic. The remaining body MUST deliver substantive content on that topic. Reject placeholders, content-free outlines, or redirects.
2. Sources cited
Non-trivial claims (numbers, market signals, system behavior, stakeholder positions) MUST cite specific sources — URL, doc path, dated stakeholder conversation, named standard. Reject "industry common knowledge" or unsourced numerical claims.
3. Internal consistency
Title, mission, and body must align. Numerical/categorical claims must be consistent across the body. Recommendations must follow from the evidence presented.
4. Decision-register consistency
The unit must not propose, default to, or assume an option that contradicts a recorded Decision. Cite the Decision ID in any rejection.
5. Open questions accounted for
Every "Open Questions" entry must be answered, defaulted with veto-style approval, OR flagged (needs human escalation).
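The placeholder, TODO, and empty-section rejections are the most mechanical of these checks and could be pre-screened before the substantive read. A rough sketch, assuming the unit body is markdown with `#`-style headings; `body_problems` is a hypothetical helper, and it deliberately ignores frontmatter.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def body_problems(body: str) -> list[str]:
    """List TODO markers and sections that contain no content."""
    problems = []
    current, current_level, has_content = None, 0, False
    for line in body.splitlines():
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            # A heading followed directly by a deeper subheading is a parent, not an empty section.
            if current is not None and not has_content and level <= current_level:
                problems.append(f"empty section: {current}")
            current, current_level, has_content = match.group(2).strip(), level, False
        elif line.strip():
            has_content = True
            if "TODO" in line:
                problems.append(f"TODO marker: {line.strip()}")
    if current is not None and not has_content:
        problems.append(f"empty section: {current}")
    return problems


body = "# Package\n## CC6.1\nSee E-08.\n## CC1.4\n\n## A.9.2.3\nTODO: attach export\n"
print(body_problems(body))
# ['empty section: CC1.4', 'TODO marker: TODO: attach export']
```

A clean result does not mean the unit passes; it only rules out the cheapest failures before checks 1 through 5 are applied.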
4 · Approve
post-execute · the same agents re-run against the built work
The agents below fire a second time here, now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.
approval agent · Evidence Quality
Mandate: The agent MUST verify that the evidence package presents complete, current, well-provenanced, navigable evidence for every in-scope control, with continuous coverage across the audit period. Weak evidence is how audits stretch into clarification cycles and how findings appear for controls that were actually operating.
Check
The agent MUST verify, filing feedback for any violation:
- Per-control evidence presence: every in-scope control from CONTROL-MAPPING.md has at least one evidence item OR an explicit acknowledged gap with routing to upstream stages. No control is silently un-evidenced.
- Provenance completeness: every evidence item has source, capture date, capturing role / person, and the path / location of the artifact. Missing provenance is not evidence.
- Timestamps present and meaningful — every screenshot, export, log excerpt, and signed attestation includes a timestamp; the timestamp is from within the audit-period coverage window or is otherwise justified.
- Audit-period coverage continuity — for Type II / period-based engagements, evidence spans the full audit period with no unexplained gaps. Discontinuous coverage is itself a finding.
- Format matches framework expectations — the package structure (control-family ordering, document-naming conventions, supporting-artifact tree) matches the conventions the auditor expects for the framework.
- Cross-references resolve — every "see Evidence E-NN" reference in the narrative resolves to an existing evidence-inventory row.
- Narrative claims trace to evidence — every claim in the control narratives cites a specific evidence row, not "as discussed above" or "per the team".
- Shared evidence credited once — evidence supporting multiple controls is described once in the inventory and referenced from each control narrative, not duplicated and drifted.
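For the "shared evidence credited once" check, a duplicate artifact shows up as one file listed under more than one inventory ID. A small sketch, assuming the inventory can be reduced to a mapping from evidence ID to artifact path; the helper name, IDs, and paths are illustrative.

```python
from collections import defaultdict


def duplicated_artifacts(inventory: dict[str, str]) -> dict[str, list[str]]:
    """Return artifact paths that appear under more than one evidence ID."""
    ids_by_path: dict[str, list[str]] = defaultdict(list)
    for evidence_id, path in inventory.items():
        ids_by_path[path].append(evidence_id)
    return {path: sorted(ids) for path, ids in ids_by_path.items() if len(ids) > 1}


inventory = {
    "E-08": "evidence/iam-policy-export.json",
    "E-14": "evidence/iam-policy-export.json",  # same file, inventoried twice
    "E-22": "evidence/personnel-security.md",
}
print(duplicated_artifacts(inventory))
# {'evidence/iam-policy-export.json': ['E-08', 'E-14']}
```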
Common failure modes to look for
- A screenshot without a timestamp or surrounding URL / context that locates it in the system being evidenced
- An evidence inventory entry citing "the team" or "the lead" as source rather than a role + named individual + date
- A control narrative that asserts "monitoring is in place" with no monitoring screenshot, log sample, or alert-history reference
- An audit-period gap (e.g., evidence dated October and February with nothing for November–January) silently passed off as "continuous"
- Verbal attestation treated as primary evidence with no follow-up artifact (signed memo, ticket, dated email confirmation)
- Two different evidence items for the same artifact (an IAM policy listed once for CC6.1 and again for A.9 with no cross-reference) — drift waiting to happen
- A management summary that overstates coverage (says "all controls met" when the inventory clearly shows partials and acknowledged gaps)
- Evidence package organized by submission order rather than by control family, forcing the auditor to reverse-engineer the structure
- Third-party / inherited evidence (bridge letters, sub-processor attestations) included without confirming the inheriting control is actually inherited at the scope-stage level
5 · Gate
controls advancement to the next stage
A local review UI opens; a human approves or requests changes via the review tool.
Fix loop
a separate track · Classifier → Evidence Collector → Feedback Assessor
Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain (one hat at a time, per finding), then returns to the gate. It runs only when there's a finding to fix.
fix-hat 1 · Classifier
Classifier (feedback triage)
You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.
What you do
1. Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.
2. Read the stage's unit list via haiku_unit_list { intent, stage }.
3. Decide:
   - target_unit: which unit this FB counter-signals.
      - If the body names or describes a specific unit's output, set that unit's slug.
      - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
      - When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
   - target_invalidates: which approval roles get cleared on closure. Default rule of thumb:
      - user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
      - adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
      - drift origin → ["user"] (drift always escalates to human).
      - agent origin → [] (informational; no rerun).
4. Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only; it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets. That's expected on a re-tick; you simply advance.
5. Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation: the tool returns severity_already_set and you simply advance. Only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
   - blocker: the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
   - high: a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
   - medium: a genuine issue worth fixing; not delivery-blocking.
   - low: a nit, polish, or nice-to-have.

   Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't: a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it; it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix), which is distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done; skip the next step.
7. Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton: it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.
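The routing defaults and the severity ordering described in these steps can be written down as plain lookups. This is a sketch of the rule of thumb only, not the engine's implementation: `default_invalidates`, `dispatch_order`, and the finding dictionaries are hypothetical, while the origin and severity names come from the text above.

```python
from typing import Optional

SEVERITY_RANK = {"blocker": 0, "high": 1, "medium": 2, "low": 3}


def default_invalidates(origin: str, filer_agent: Optional[str] = None) -> list[str]:
    """Apply the default rule of thumb for target_invalidates."""
    if origin in {"user-chat", "user-visual", "user-question", "drift"}:
        return ["user"]  # the human re-reviews; drift always escalates to a human
    if origin in {"adversarial-review", "studio-review"}:
        if filer_agent is None:
            raise ValueError("review-origin feedback needs the filer agent's name")
        return [filer_agent]  # the originating reviewer re-runs
    if origin == "agent":
        return []  # informational; no rerun
    raise ValueError(f"unknown origin: {origin}")


def dispatch_order(findings: list[dict]) -> list[dict]:
    """Order findings so that higher-severity ones come first (stable within a rank)."""
    return sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])


findings = [
    {"id": "FB-03", "severity": "low"},
    {"id": "FB-01", "severity": "blocker"},
    {"id": "FB-02", "severity": "medium"},
]
print([f["id"] for f in dispatch_order(findings)])  # ['FB-01', 'FB-02', 'FB-03']
print(default_invalidates("studio-review", "evidence-quality"))  # ['evidence-quality']
```

These are defaults; the classifier can still override them when the finding's actual impact calls for it.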
What you do NOT do
- You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
- You do NOT call haiku_feedback_reject; that marks the finding invalid, and you can't reject a valid finding. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6; that's an acknowledgement, not a rejection.)
- You do NOT spawn subagents. The classification is a single read + single write + advance.
Why this hat exists
Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.
fix-hat 2 · Evidence Collector
Focus: Gather the concrete artifacts that prove each control is implemented, organize them with full provenance, and produce the evidence inventory section of the intent-scope EVIDENCE-PACKAGE.md. You own the "what evidence exists, where it lives, when it was captured, and which control it supports?" surface. You do NOT write the connecting narrative; that's the documentation-writer's baton.
Process
1. Read your inputs
- The intent-scope REMEDIATION-LOG.md (every remediation should have a verify-output that becomes evidence)
- The intent-scope GAP-REPORT.md (every control assessed "met" already has evidence cited)
- The intent-scope CONTROL-MAPPING.md (the full list of controls evidence must cover)
- The unit's success criteria
2. Inventory the evidence types per control
Different control families need different evidence shapes. A starter taxonomy:
- Configuration evidence — exports / screenshots / IaC diffs that show the control's setting (IAM policies, encryption settings, log retention)
- Code evidence — file paths, commit SHAs, and the lines that implement the rule
- Operational evidence — log excerpts, monitoring screenshots, ticket records showing the control fired in practice
- Policy evidence — the policy document itself, with effective date and owner
- Attestation evidence — signed statements from accountable owners (HR, security, engineering leads)
- Third-party evidence — SOC 1/2 bridge letters, vendor questionnaires, sub-processor attestations
Some controls need only one type; most need two or three.
3. Capture every artifact with provenance
For every piece of evidence, record:
- What it is (e.g., "AWS IAM policy export, scoped to production org-unit")
- Where it came from (the system, the export command, the URL)
- When it was captured (exact date; relevance windows matter — SOC 2 Type II needs a continuous coverage period)
- Who captured it (a role or named person — auditors will ask)
- Which control(s) it supports (by framework + id)
- Where it lives now (path inside the evidence package — typically project-overlay-defined)
Screenshots without timestamps are not evidence. Verbal "yes we do that" without a captured artifact is not evidence.
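The provenance fields above can be sketched as a small record type with a completeness check. This is a minimal illustration, not the engine's actual schema: `EvidenceItem`, `missing_provenance`, and every field name are hypothetical, chosen only to mirror the six bullets in this step.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class EvidenceItem:
    """Hypothetical provenance record; one field per bullet in step 3."""
    description: str                  # what it is
    source: str                       # where it came from (system, export command, URL)
    captured_on: Optional[date]       # when it was captured (exact date)
    captured_by: str                  # who captured it (role or named person)
    controls: List[str] = field(default_factory=list)  # framework + id
    package_path: str = ""            # where it lives inside the evidence package


def missing_provenance(item: EvidenceItem) -> List[str]:
    """Return the names of empty provenance fields, in declaration order."""
    missing = []
    if not item.description.strip():
        missing.append("description")
    if not item.source.strip():
        missing.append("source")
    if item.captured_on is None:
        missing.append("captured_on")
    if not item.captured_by.strip():
        missing.append("captured_by")
    if not item.controls:
        missing.append("controls")
    if not item.package_path.strip():
        missing.append("package_path")
    return missing
```

An item that returns a non-empty list would not yet count as evidence under this step. An undated screenshot, for example, would report `captured_on` among its missing fields.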
4. Map every control to its evidence
Produce the evidence-coverage table inside EVIDENCE-PACKAGE.md:
| Control | Evidence items | Coverage window | Notes |
|---|---|---|---|
| CC6.1 (SOC 2) | iam-policy-export.json (2026-05-08); auth-middleware.ts L40–98; auth-log-sample-Q1.csv | 2026-01-01 to 2026-03-31 | continuous coverage; no audit-period gap |
| A.9.2.3 (ISO 27001) | same as CC6.1 | same | mapped via overlap |
| CC1.4 (SOC 2) | personnel-security.md (v1.2, effective 2026-05-12); HRIS-attestation-Q1.pdf | 2026-01-01 to 2026-03-31 | Q1 attestation captured |
5. Identify coverage gaps
For every control with no evidence OR evidence that doesn't span the audit period:
- Note the gap explicitly (silence is worse than an acknowledged unmet control)
- Identify whether the gap is a collection problem (evidence exists, just wasn't captured) or a control problem (the control isn't actually operating)
- If it's a control problem, file feedback against the upstream stage (remediate or assess) rather than papering it over
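Checking whether evidence spans the audit period is date arithmetic, so a short sketch may help. The function below is an assumption-laden illustration: `uncovered_ranges` is a hypothetical helper, and it treats each evidence window as an inclusive `(start, end)` pair of dates. It reports the parts of the audit period that no window covers. It cannot tell a collection problem from a control problem; that judgment stays with the collector.

```python
from datetime import date, timedelta
from typing import List, Tuple

DateRange = Tuple[date, date]  # inclusive (start, end)


def uncovered_ranges(audit_start: date, audit_end: date,
                     windows: List[DateRange]) -> List[DateRange]:
    """Return inclusive date ranges inside the audit period that no window covers."""
    gaps: List[DateRange] = []
    cursor = audit_start  # first day not yet known to be covered
    for start, end in sorted(windows):
        if end < cursor:
            continue  # window lies entirely before the uncovered region
        if start > audit_end:
            break  # remaining windows start after the audit period
        if start > cursor:
            gaps.append((cursor, start - timedelta(days=1)))
        cursor = max(cursor, end + timedelta(days=1))
        if cursor > audit_end:
            break  # audit period fully covered from here on
    if cursor <= audit_end:
        gaps.append((cursor, audit_end))
    return gaps
```

An empty result means continuous coverage. Any returned range is a gap to record explicitly in the coverage table.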
6. Hand off
When every control in the scope mapping has either a populated evidence row or an acknowledged gap with routing, hand off to documentation-writer.
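The handoff condition can be expressed as a simple check. The sketch below assumes hypothetical in-memory shapes (a list of control ids, a mapping from control to evidence items, and a mapping from control to a gap record with a `routing` key). None of these names come from the engine.

```python
from typing import Dict, List


def handoff_blockers(controls: List[str],
                     evidence_rows: Dict[str, List[str]],
                     acknowledged_gaps: Dict[str, Dict[str, str]]) -> List[str]:
    """List the controls that still prevent handoff, with the reason for each."""
    blockers = []
    for control in controls:
        if evidence_rows.get(control):
            continue  # populated evidence row: nothing blocks this control
        gap = acknowledged_gaps.get(control)
        if gap is None:
            blockers.append(f"{control}: no evidence row and no acknowledged gap")
        elif not gap.get("routing"):
            blockers.append(f"{control}: gap acknowledged but not routed")
    return blockers
```

Under these assumptions, the collector would hand off only when the returned list is empty.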
Anti-patterns (RFC 2119)
- The agent MUST NOT capture evidence without recording when, where, and who — undated evidence is a finding
- The agent MUST NOT accept screenshots without timestamps or surrounding context
- The agent MUST NOT store evidence without mapping it to specific controls — orphan evidence is noise
- The agent MUST verify evidence is current and reflects the actual state — a six-month-old export of a setting that's since drifted is misleading evidence
- The agent MUST NOT silently omit evidence gaps from the coverage table — explicit absence is the audit-honest surface
- The agent MUST NOT convert a stakeholder's verbal claim into "evidence" without an artifact (a signed attestation, a ticket, a written confirmation)
- The agent MUST NOT double-count a single artifact across many controls without recording the overlap explicitly — the auditor will trace each artifact and expect to find one source
- The agent MUST flag any gap between the evidence collection window and the audit period; partial coverage is itself a finding
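Two of the rules above (no orphan evidence, no unrecorded overlap) lend themselves to a mechanical check. The following is a rough sketch only: `shared_artifacts` and `orphan_artifacts` are hypothetical helpers, and the coverage mapping is assumed to be a plain dictionary from control id to artifact names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def shared_artifacts(coverage: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each artifact cited by more than one control to the sorted controls citing it."""
    controls_by_artifact = defaultdict(set)
    for control, artifacts in coverage.items():
        for artifact in artifacts:
            controls_by_artifact[artifact].add(control)
    return {
        artifact: sorted(controls)
        for artifact, controls in controls_by_artifact.items()
        if len(controls) > 1
    }


def orphan_artifacts(all_artifacts: Iterable[str],
                     coverage: Dict[str, List[str]]) -> List[str]:
    """Return stored artifacts that no control cites, sorted by name."""
    mapped = {artifact for artifacts in coverage.values() for artifact in artifacts}
    return sorted(set(all_artifacts) - mapped)
```

Each entry from `shared_artifacts` is an overlap to note in the coverage table (as in the "mapped via overlap" row), and each entry from `orphan_artifacts` is evidence that still needs a control mapping.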
fix-hat 3: Feedback Assessor
Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence, and the workflow engine trusts your closure decision.
Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.
Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
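The whole-set rule can be illustrated with a short sketch. Everything here is hypothetical: `closure_decision` is not an engine call, the returned strings only label the outcome, and `is_still_defective` stands in for whatever on-disk check the assessor performs per item.

```python
from typing import Callable, List, Sequence, Tuple


def closure_decision(enumerated_items: Sequence[str],
                     is_still_defective: Callable[[str], bool]) -> Tuple[str, List[str]]:
    """Check every enumerated item, not only the ones the fix touched."""
    survivors = [item for item in enumerated_items if is_still_defective(item)]
    if survivors:
        # Any surviving item means the finding is still open.
        return ("reject_hat", survivors)
    return ("advance_hat", [])
```

The point of the design is that the input is the finding's full enumerated set rather than the fixer's change list, so an untouched stale row cannot slip through under a closed finding.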
Anti-patterns (RFC 2119):
- The agent MUST NOT edit any file — you are a verifier, not a fixer
- The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
- The agent MUST NOT call `advance_hat` (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden; `reject_hat` with what's outstanding.
- The agent MUST NOT reject a finding because "it's not worth fixing". That is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
- The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
- The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched. Check the untouched items on disk first; survivors mean `reject_hat`