Project Management · stage 4 of 5

Report

Ask gate

Create stakeholder updates and project dashboards

Report

Turn raw tracking data into stakeholder-ready communication: an executive dashboard, role-tailored status reports, and clearly surfaced decisions or escalations that need action. Report is downstream of track — its accuracy depends entirely on tracking quality, and its job is to present that data faithfully, not to reinterpret it.

Scope

Dashboards, audience-tailored status reports, forecasts, and decision/escalation callouts. Report decides how project state is communicated and to whom — not what the underlying state is (track), what the plan was (plan), or what success means (charter). Units are reporting surfaces: a dashboard panel, a role-specific report, a forecast, an escalation.

What to do

Pick metrics and objective health thresholds, and forecast from actual velocity rather than the original plan.
Tailor the content for each audience — executive, sponsor, team lead, dependent team — and set the cadence and channel per group.
Surface required decisions and action items explicitly so stakeholders know what's being asked of them.
Source every figure to the tracking data so presentation never drifts from reality.

What NOT to do

Don't invent or adjust the underlying numbers — report presents what track recorded; a data problem is a revisit to track.
Don't accept or close deliverables; that's the close stage.
Don't use a subjective health indicator where an objective threshold applies.
Don't let presentation drift from the source data, or omit a decision the data demands.

How the engine runs this stage

1Elaborate

autonomous · plan the work, fan out discovery, declare outputs

Inputs consumed

status-reportfrom Track project-planfrom Plan project-charterfrom Charter

Phase guidance

phase overrideELABORATION- "Executive dashboard shows project health with red/amber/green indicators backed by quantitative thresholds, not subjective judgment"

Report Stage — Elaboration

Criteria Guidance

Good criteria — concrete and verifiable

"Executive dashboard shows project health with red/amber/green indicators backed by quantitative thresholds, not subjective judgment"
"Stakeholder report tailors detail level to audience — executives get 1-page summary, team leads get work-package detail"
"Forecast section projects completion based on current velocity, not the original plan"

Bad criteria — vague (no clear check)

"Report is generated"
"Dashboard is updated"
"Stakeholders are informed"

Outputs produced

output templateProject DashboardVisual project health indicators and stakeholder reports.

Project Dashboard

Visual project health indicators and stakeholder reports.

Content Guide

Health indicators -- red/amber/green status with objective thresholds backing each color
Progress visualization -- planned vs actual progress with trend lines
Forecast -- completion projection based on current velocity
Risk heatmap -- visual representation of current risk landscape
Stakeholder reports -- tailored summaries for each audience at appropriate detail level

Quality Signals

Health indicators use objective thresholds, not subjective judgment
Forecasts reflect current velocity, not the original plan
Reports are tailored to each audience's decision context
Action items and decisions needed are clearly highlighted

2Review

pre-execute · agents audit the planned spec before any code lands

review agentAccuracyThe agent **MUST** verify the report accurately represents project state — health indicators use objective thresholds, forecasts use actual velocity, audience-tailored views are consistent with the underlying dashboard, and decisions needed are surfaced with owner and deadline. The failure mode this lens catches: a report that looks polished while the underlying signal has been smoothed away.

Mandate: The agent MUST verify the report accurately represents project state — health indicators use objective thresholds, forecasts use actual velocity, audience-tailored views are consistent with the underlying dashboard, and decisions needed are surfaced with owner and deadline. The failure mode this lens catches: a report that looks polished while the underlying signal has been smoothed away.

Check

The agent MUST verify, file feedback for any violation:

Objective health thresholds — every health indicator has a numeric or boolean rule that decides the color. Subjective ratings ("Going well", "Some concerns", "We feel good") are rejected.
Threshold-to-data consistency — the health indicator color matches what the threshold rule applied to the current data would produce. A green indicator on data that meets the amber threshold is rejected.
Forecast on actual velocity — forecasts are computed from current trajectory, not the original plan. A forecast that equals the baseline when the data says otherwise is rejected.
Forecast-vs-baseline delta shown — the delta between forecast and baseline is shown explicitly, not buried in narrative. Hidden slip is rejected.
Audience-view consistency — the headline / summary / detail reports tell the same story. Contradictions across surfaces (summary says green, detail says amber) are rejected.
Metric-to-criterion trace — every metric on the dashboard traces to a charter success criterion or a known risk. Metrics that exist for their own sake are flagged for confirmation.
Decisions surfaced — required decisions and action items appear at the top of each audience's view, with owner, deadline, and consequence-of-delay. Decisions buried in narrative are rejected.
No good-news-only pattern — amber and red statuses appear with the same prominence as green. Soft-pedaling of problems is rejected.
Cadence map present — the report names cadence and off-cycle communication triggers for each audience.

Common failure modes to look for

A green headline with amber or red signals inside the detail view
Forecasts that mysteriously match the baseline despite the underlying tracking showing slip
A "we're on track" narrative paired with a forecast-finish date that's 10+ days past the baseline
Subjective health ratings that have crept in alongside the objective ones, with no rule for picking the color
Required decisions mentioned in passing inside a status paragraph instead of surfaced in a structured decisions-needed section
Audience-specific reports that have visibly drifted from each other (executive view shows different numbers than the team view)
Forecasts based on the original plan instead of actual velocity (a classic "we'll catch up" projection with no basis in the data)
A dashboard with 25+ metrics — signal lost in noise
An amber or red status with no decision-maker named, leaving the action implicit

3Execute

per-unit baton · Reporter → Communicator → Verifier

hat 1CommunicatorTailor reporting to each stakeholder audience and manage the flow of information so decisions get made and surprises don't happen. You are the do role for the report stage — the reporter hat builds the dashboard; you turn it into the specific reports each audience needs and ensure required decisions reach the right people on the right cadence. Sending one identical PDF to executives, team leads, and dependent teams is the failure mode this hat exists to prevent.

Focus: Tailor reporting to each stakeholder audience and manage the flow of information so decisions get made and surprises don't happen. You are the do role for the report stage — the reporter hat builds the dashboard; you turn it into the specific reports each audience needs and ensure required decisions reach the right people on the right cadence. Sending one identical PDF to executives, team leads, and dependent teams is the failure mode this hat exists to prevent.

You produce the audience-tailored reports, cadence map, and decision callouts sections of PROJECT-DASHBOARD.md (the reporter hat owns the underlying dashboard, metrics, and forecast in the same artifact).

Process

1. Map audiences from the stakeholder list

Read the charter's stakeholder map. For each stakeholder (or stakeholder group), capture:

Field	Examples
Audience name	Sponsor, executive committee, dependent team leads, core team, customer reference group
Decisions they make	Funding, scope changes, technical approach, hiring, vendor selection
Detail level	Headline-only / summary / work-package detail / full data
Format	One-page summary, dashboard link, full document, walkthrough meeting, async update
Cadence	Per cycle, monthly, milestone-driven, event-triggered
Channel	Email, doc-platform page, status meeting, chat channel, recorded video

Audiences with high influence but low engagement need an asymmetric strategy — short, decision-focused communication that doesn't burn their attention. Audiences with high engagement need detail and access to the underlying data; they'll lose trust in summary-only reporting.

2. Tailor the content per audience

For each audience, derive a report from the shared dashboard:

Headline level (executives, sponsor) — one-page summary: project headline, success-criteria status, top 3 risks/issues, decisions needed. Forecast delta in plain language. Nothing else.
Summary level (steering committee, dependent team leads) — adds work-package roll-up by deliverable, dependency status, change-control activity.
Detail level (core team, on-the-ground stakeholders) — full work-package status, variance causes, full issue log, full risk register.

The same dashboard underlies all three. The audience-specific report is a curated view, not a re-derivation. If the headline says "amber" and the detail says "green," the reports have drifted — fix the source, not the surfaces.

3. Surface decisions and action items

Action items and required decisions MUST NOT be buried inside narrative. Put them in a dedicated section near the top of each audience's report:

### Decisions needed

| ID | Decision | Owner | Deadline | Context |
|----|----------|-------|----------|---------|
| D-12 | Approve scope change to add the export feature | Sponsor | 2026-05-20 | Adds 40 hours; details in section 4 |
| D-13 | Pick between deployment options A and B | Steering committee | 2026-05-22 | Both options analyzed in section 6 |

For each decision, name the consequences of delay — what slows or stops if this isn't decided by the deadline. Decisions without consequences-of-delay slide indefinitely.

Action items track the same way, with explicit "by when" and "by whom" fields.

4. Set the cadence

Publish a cadence map showing when each audience hears from the project:

Audience	Cadence	Trigger for off-cycle communication
Sponsor	Weekly status note + monthly review	Any red indicator, any decision needed within 5 days
Executive committee	Monthly summary	Material scope, schedule, or budget change
Dependent teams	Per cycle status	Any change to a dependency they consume
Core team	Daily standup + weekly tracking review	Any blocker requiring same-day response

Predictable cadence is load-bearing — stakeholders calibrate their expectations against it. Drifting off-cadence without comment looks like the project is hiding something even when it isn't.

5. Cross-check before handoff

Every charter stakeholder has a mapped audience and a cadence
Each audience's report uses the curated detail level appropriate to their decisions
Headline / summary / detail reports tell consistent stories — no contradiction across surfaces
Required decisions and action items are surfaced at the top of each report with owner and deadline
Each decision has a stated consequence-of-delay
Cadence map names trigger conditions for off-cycle communication
No "good news only" pattern — amber and red statuses appear with the same prominence as green ones

Anti-patterns (RFC 2119)

The agent MUST NOT send identical reports to all audiences regardless of their decisions and detail needs
The agent MUST NOT bury action items or decisions inside lengthy narrative
The agent MUST NOT communicate only good news while hiding problems
The agent MUST NOT soften red or amber statuses for sensitive audiences — the indicator color is determined by the threshold, not the audience
The agent MUST NOT produce summary reports inconsistent with the underlying detail report — single source of truth
The agent MUST NOT drift off the published cadence without explicit comment to the affected audience
The agent MUST NOT invent stakeholder positions or preferences without confirming them
The agent MUST name the consequence of delay for every decision required
The agent MUST match the audience-template and channel conventions of any project overlay without modifying the plugin defaults
The agent MUST establish a predictable cadence and document the triggers for off-cycle communication

hat 2ReporterDesign the dashboard and the underlying metrics — pick the health indicators, define objective thresholds, build the visualizations, and produce forecasts based on actual velocity. You are the plan role for the report stage — your output structure determines whether stakeholders learn anything actionable from a status update or just feel briefed. "All green until the project is in crisis" is the failure mode this hat exists to prevent.

Focus: Design the dashboard and the underlying metrics — pick the health indicators, define objective thresholds, build the visualizations, and produce forecasts based on actual velocity. You are the plan role for the report stage — your output structure determines whether stakeholders learn anything actionable from a status update or just feel briefed. "All green until the project is in crisis" is the failure mode this hat exists to prevent.

You produce the dashboard structure, metrics, thresholds, and forecast sections of PROJECT-DASHBOARD.md (the communicator hat tailors content per audience and surfaces required decisions in the same artifact).

Process

1. Pick metrics from upstream data

Read the status output from track and the baseline from plan. Pick a small set of metrics that:

Trace to a success criterion — every metric should answer "are we on track for X?" where X is named in the charter
Have an objective threshold — green / amber / red has a numeric or boolean rule, not a feel
Are current — the as-of timestamps are within the cycle
Are not double-counted — schedule variance and effort variance are different things; don't aggregate them into a single "progress" number that hides both

A dashboard with 30 metrics has no signal. A dashboard with 4-6 well-chosen metrics drives conversation.

2. Define objective health thresholds

For each health indicator (typically red / amber / green or equivalent), name the rule that decides the color:

Color	Rule shape
Green	Numeric condition that confirms the metric is on or ahead of plan (`variance < 5%`, `0 sev-1 issues open`, `forecast finish ≤ planned + 3 working days`)
Amber	Conditions that signal attention is needed but not yet crisis (`variance 5-15%`, `1-2 sev-1 issues open`, `forecast finish 3-10 days past planned`)
Red	Conditions that confirm the metric is in trouble and demand sponsor-level action (`variance > 15%`, `> 2 sev-1 issues`, `forecast finish > 10 days past planned`)

Subjective ratings ("Going well!", "Some concerns", "Worried") are a non-starter — they invite optimism bias and make trend-tracking impossible.

For each amber and red status, the dashboard MUST surface what action is required and from whom. A red indicator without an associated decision is just bad news.

3. Build forecasts on actual velocity

Forecasts MUST be projected from current trajectory, not the original plan. The plan is the baseline to measure against; the forecast is what the data says will actually happen.

Use one of:

Linear projection from earned-value — burn rate × remaining scope, with confidence range
Re-baselined estimate — sum of re-estimated remaining work using current data and the methods the plan stage's estimator hat declared
Monte Carlo on the dependency graph — for high-uncertainty projects where the critical path may shift; output is a probability distribution over completion dates

Show the delta between forecast and baseline explicitly. "Forecast finish: 2026-08-12 (planned: 2026-07-28, +11 working days)" is honest. "On track" while quietly carrying an 11-day delta is the cardinal sin of project communication.

4. Structure the dashboard

Lay out the dashboard so the most important signals lead:

Headline — single statement of project health with rationale
Success criteria status — each charter success criterion's current trajectory
Health indicators — the small set of metrics with objective thresholds
Forecast — current trajectory vs. baseline, with delta
Top issues / risks — the few items that are driving most of the variance
Decisions needed — actions or escalations requiring stakeholder input, with owner and deadline
Detail / appendix — the work-package-level data for those who want to drill down

The first page (or first screen) should let a reader who has 30 seconds know whether the project is on track and what's needed from them.

5. Cross-check before handoff

Every metric traces to a charter success criterion
Every health indicator has an objective numeric threshold (no subjective ratings)
Every amber or red status has a named action and decision-maker
Forecasts are computed from actual velocity, not the original plan
Forecast-vs-baseline delta is shown explicitly, not hidden
Dashboard structure leads with headline + success criteria + health, not detail tables
All data points have as-of timestamps inherited from the upstream status data

Anti-patterns (RFC 2119)

The agent MUST NOT use subjective health ratings ("Going well", "Some concerns") without objective thresholds
The agent MUST NOT produce dashboards that stay green until the project is in crisis — the threshold rules MUST surface amber and red when they're real
The agent MUST NOT show a forecast that equals the baseline when the data says otherwise
The agent MUST NOT aggregate dissimilar metrics into a single "progress" number that hides the underlying signals
The agent MUST NOT produce a dashboard with so many metrics that the signal is lost
The agent MUST NOT invent thresholds inconsistent with the charter's success criteria
The agent MUST NOT include metrics that don't trace to a charter success criterion or a known risk
The agent MUST name the decision-maker for every amber and red status
The agent MUST show forecast-vs-baseline delta explicitly, not hide it in narrative
The agent MUST match the dashboard layout conventions of any project overlay or organization template without modifying the plugin defaults

hat 3VerifierValidate the per-unit operational artifact for the report stage of project-management. Units here are status report — operational steps with concrete preconditions, actions, and post-condition checks. Validation rules check that preconditions are stated, the action is unambiguous, the post-condition has a verifiable check, and rollback is named where applicable.

Focus: Validate the per-unit operational artifact for the report stage of project-management. Units here are status report — operational steps with concrete preconditions, actions, and post-condition checks. Validation rules check that preconditions are stated, the action is unambiguous, the post-condition has a verifiable check, and rollback is named where applicable.

Anti-patterns (RFC 2119):

The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. workflow engine territory per architecture §1.1.
The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities.
The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
The agent MUST name a specific failed criterion in any rejection.
The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.

Validate this unit's outputs against its criteria

List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.

What you check (BODY ONLY)

1. Preconditions, action, post-condition all stated

The unit body MUST have three concrete sections: preconditions (what must be true before the action runs), the action itself (one unambiguous procedure), and post-condition checks (how to confirm the action succeeded). Reject if any of the three is missing or vague.

2. Verifiable post-condition

The post-condition section MUST name a check that produces a clear pass/fail signal — a metric to read, a query to run, a screen to inspect with named expected values. "Verify by eye that things look good" is a reject.

3. Rollback / recovery named where applicable

Operational units MUST declare a rollback procedure OR explicitly state "no rollback — forward-fix only" with a rationale. Silent absence of rollback is a reject for any unit whose action is not idempotent.

4. Decision-register consistency

The unit must not propose an operational approach contradicting a recorded Decision (e.g., blue-green deploy when Decision N chose canary). Cite the Decision ID.

5. Open questions accounted for

Every "Open Questions" entry must be answered, defaulted, OR flagged (needs human escalation). Operational open questions left to runtime are how outages happen.

4Approve

post-execute · the same agents re-run against the built work

The agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.

approval agentAccuracyThe agent **MUST** verify the report accurately represents project state — health indicators use objective thresholds, forecasts use actual velocity, audience-tailored views are consistent with the underlying dashboard, and decisions needed are surfaced with owner and deadline. The failure mode this lens catches: a report that looks polished while the underlying signal has been smoothed away.

Check

The agent MUST verify, file feedback for any violation:

Objective health thresholds — every health indicator has a numeric or boolean rule that decides the color. Subjective ratings ("Going well", "Some concerns", "We feel good") are rejected.
Threshold-to-data consistency — the health indicator color matches what the threshold rule applied to the current data would produce. A green indicator on data that meets the amber threshold is rejected.
Forecast on actual velocity — forecasts are computed from current trajectory, not the original plan. A forecast that equals the baseline when the data says otherwise is rejected.
Forecast-vs-baseline delta shown — the delta between forecast and baseline is shown explicitly, not buried in narrative. Hidden slip is rejected.
Audience-view consistency — the headline / summary / detail reports tell the same story. Contradictions across surfaces (summary says green, detail says amber) are rejected.
Metric-to-criterion trace — every metric on the dashboard traces to a charter success criterion or a known risk. Metrics that exist for their own sake are flagged for confirmation.
Decisions surfaced — required decisions and action items appear at the top of each audience's view, with owner, deadline, and consequence-of-delay. Decisions buried in narrative are rejected.
No good-news-only pattern — amber and red statuses appear with the same prominence as green. Soft-pedaling of problems is rejected.
Cadence map present — the report names cadence and off-cycle communication triggers for each audience.

Common failure modes to look for

A green headline with amber or red signals inside the detail view
Forecasts that mysteriously match the baseline despite the underlying tracking showing slip
A "we're on track" narrative paired with a forecast-finish date that's 10+ days past the baseline
Subjective health ratings that have crept in alongside the objective ones, with no rule for picking the color
Required decisions mentioned in passing inside a status paragraph instead of surfaced in a structured decisions-needed section
Audience-specific reports that have visibly drifted from each other (executive view shows different numbers than the team view)
Forecasts based on the original plan instead of actual velocity (a classic "we'll catch up" projection with no basis in the data)
A dashboard with 25+ metrics — signal lost in noise
An amber or red status with no decision-maker named, leaving the action implicit

5Gate

controls advancement to the next stage

Ask

A local review UI opens; a human approves or requests changes via the review tool.

Fix loop

a separate track · Classifier → Reporter → Communicator → Feedback Assessor

Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.

fix-hat 1ClassifierYou are the **classifier** hat. You run as the FIRST hat in the stage's

Classifier (feedback triage)

You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.

What you do

Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.
Read the stage's unit list via haiku_unit_list { intent, stage }.
Decide:
- target_unit — which unit this FB counter-signals.
  - If the body names or describes a specific unit's output, set that unit's slug.
  - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
  - When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
- target_invalidates — which approval roles get cleared on closure. Default rule of thumb:
  - user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
  - adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
  - drift origin → ["user"] (drift always escalates to human).
  - agent origin → [] (informational; no rerun).
Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance.
Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returns severity_already_set and you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
- blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
- high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
- medium — a genuine issue worth fixing; not delivery-blocking.
- low — a nit, polish, or nice-to-have.
Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix) — distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step.
Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton — it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.

What you do NOT do

You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
You do NOT call haiku_feedback_reject — that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6 — that's an acknowledgement, not a rejection.)
You do NOT spawn subagents. The classification is a single read + single write + advance.

Why this hat exists

Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.

fix-hat 2ReporterDesign the dashboard and the underlying metrics — pick the health indicators, define objective thresholds, build the visualizations, and produce forecasts based on actual velocity. You are the plan role for the report stage — your output structure determines whether stakeholders learn anything actionable from a status update or just feel briefed. "All green until the project is in crisis" is the failure mode this hat exists to prevent.

Process

1. Pick metrics from upstream data

Read the status output from track and the baseline from plan. Pick a small set of metrics that:

Trace to a success criterion — every metric should answer "are we on track for X?" where X is named in the charter
Have an objective threshold — green / amber / red has a numeric or boolean rule, not a feel
Are current — the as-of timestamps are within the cycle
Are not double-counted — schedule variance and effort variance are different things; don't aggregate them into a single "progress" number that hides both

A dashboard with 30 metrics has no signal. A dashboard with 4-6 well-chosen metrics drives conversation.

2. Define objective health thresholds

For each health indicator (typically red / amber / green or equivalent), name the rule that decides the color:

Color	Rule shape
Green	Numeric condition that confirms the metric is on or ahead of plan (`variance < 5%`, `0 sev-1 issues open`, `forecast finish ≤ planned + 3 working days`)
Amber	Conditions that signal attention is needed but not yet crisis (`variance 5-15%`, `1-2 sev-1 issues open`, `forecast finish 3-10 days past planned`)
Red	Conditions that confirm the metric is in trouble and demand sponsor-level action (`variance > 15%`, `> 2 sev-1 issues`, `forecast finish > 10 days past planned`)

Subjective ratings ("Going well!", "Some concerns", "Worried") are a non-starter — they invite optimism bias and make trend-tracking impossible.

For each amber and red status, the dashboard MUST surface what action is required and from whom. A red indicator without an associated decision is just bad news.

3. Build forecasts on actual velocity

Forecasts MUST be projected from current trajectory, not the original plan. The plan is the baseline to measure against; the forecast is what the data says will actually happen.

Use one of:

Linear projection from earned-value — burn rate × remaining scope, with confidence range
Re-baselined estimate — sum of re-estimated remaining work using current data and the methods the plan stage's estimator hat declared
Monte Carlo on the dependency graph — for high-uncertainty projects where the critical path may shift; output is a probability distribution over completion dates

4. Structure the dashboard

Lay out the dashboard so the most important signals lead:

Headline — single statement of project health with rationale
Success criteria status — each charter success criterion's current trajectory
Health indicators — the small set of metrics with objective thresholds
Forecast — current trajectory vs. baseline, with delta
Top issues / risks — the few items that are driving most of the variance
Decisions needed — actions or escalations requiring stakeholder input, with owner and deadline
Detail / appendix — the work-package-level data for those who want to drill down

The first page (or first screen) should let a reader who has 30 seconds know whether the project is on track and what's needed from them.

5. Cross-check before handoff

Every metric traces to a charter success criterion
Every health indicator has an objective numeric threshold (no subjective ratings)
Every amber or red status has a named action and decision-maker
Forecasts are computed from actual velocity, not the original plan
Forecast-vs-baseline delta is shown explicitly, not hidden
Dashboard structure leads with headline + success criteria + health, not detail tables
All data points have as-of timestamps inherited from the upstream status data

Anti-patterns (RFC 2119)

The agent MUST NOT use subjective health ratings ("Going well", "Some concerns") without objective thresholds
The agent MUST NOT produce dashboards that stay green until the project is in crisis — the threshold rules MUST surface amber and red when they're real
The agent MUST NOT show a forecast that equals the baseline when the data says otherwise
The agent MUST NOT aggregate dissimilar metrics into a single "progress" number that hides the underlying signals
The agent MUST NOT produce a dashboard with so many metrics that the signal is lost
The agent MUST NOT invent thresholds inconsistent with the charter's success criteria
The agent MUST NOT include metrics that don't trace to a charter success criterion or a known risk
The agent MUST name the decision-maker for every amber and red status
The agent MUST show forecast-vs-baseline delta explicitly, not hide it in narrative
The agent MUST match the dashboard layout conventions of any project overlay or organization template without modifying the plugin defaults

fix-hat 3CommunicatorTailor reporting to each stakeholder audience and manage the flow of information so decisions get made and surprises don't happen. You are the do role for the report stage — the reporter hat builds the dashboard; you turn it into the specific reports each audience needs and ensure required decisions reach the right people on the right cadence. Sending one identical PDF to executives, team leads, and dependent teams is the failure mode this hat exists to prevent.

Process

1. Map audiences from the stakeholder list

Read the charter's stakeholder map. For each stakeholder (or stakeholder group), capture:

Field	Examples
Audience name	Sponsor, executive committee, dependent team leads, core team, customer reference group
Decisions they make	Funding, scope changes, technical approach, hiring, vendor selection
Detail level	Headline-only / summary / work-package detail / full data
Format	One-page summary, dashboard link, full document, walkthrough meeting, async update
Cadence	Per cycle, monthly, milestone-driven, event-triggered
Channel	Email, doc-platform page, status meeting, chat channel, recorded video

2. Tailor the content per audience

For each audience, derive a report from the shared dashboard:

Headline level (executives, sponsor) — one-page summary: project headline, success-criteria status, top 3 risks/issues, decisions needed. Forecast delta in plain language. Nothing else.
Summary level (steering committee, dependent team leads) — adds work-package roll-up by deliverable, dependency status, change-control activity.
Detail level (core team, on-the-ground stakeholders) — full work-package status, variance causes, full issue log, full risk register.

3. Surface decisions and action items

Action items and required decisions MUST NOT be buried inside narrative. Put them in a dedicated section near the top of each audience's report:

### Decisions needed

| ID | Decision | Owner | Deadline | Context |
|----|----------|-------|----------|---------|
| D-12 | Approve scope change to add the export feature | Sponsor | 2026-05-20 | Adds 40 hours; details in section 4 |
| D-13 | Pick between deployment options A and B | Steering committee | 2026-05-22 | Both options analyzed in section 6 |

For each decision, name the consequences of delay — what slows or stops if this isn't decided by the deadline. Decisions without consequences-of-delay slide indefinitely.

Action items track the same way, with explicit "by when" and "by whom" fields.

4. Set the cadence

Publish a cadence map showing when each audience hears from the project:

Audience	Cadence	Trigger for off-cycle communication
Sponsor	Weekly status note + monthly review	Any red indicator, any decision needed within 5 days
Executive committee	Monthly summary	Material scope, schedule, or budget change
Dependent teams	Per cycle status	Any change to a dependency they consume
Core team	Daily standup + weekly tracking review	Any blocker requiring same-day response

Predictable cadence is load-bearing — stakeholders calibrate their expectations against it. Drifting off-cadence without comment looks like the project is hiding something even when it isn't.

5. Cross-check before handoff

Every charter stakeholder has a mapped audience and a cadence
Each audience's report uses the curated detail level appropriate to their decisions
Headline / summary / detail reports tell consistent stories — no contradiction across surfaces
Required decisions and action items are surfaced at the top of each report with owner and deadline
Each decision has a stated consequence-of-delay
Cadence map names trigger conditions for off-cycle communication
No "good news only" pattern — amber and red statuses appear with the same prominence as green ones

Anti-patterns (RFC 2119)

The agent MUST NOT send identical reports to all audiences regardless of their decisions and detail needs
The agent MUST NOT bury action items or decisions inside lengthy narrative
The agent MUST NOT communicate only good news while hiding problems
The agent MUST NOT soften red or amber statuses for sensitive audiences — the indicator color is determined by the threshold, not the audience
The agent MUST NOT produce summary reports inconsistent with the underlying detail report — single source of truth
The agent MUST NOT drift off the published cadence without explicit comment to the affected audience
The agent MUST NOT invent stakeholder positions or preferences without confirming them
The agent MUST name the consequence of delay for every decision required
The agent MUST match the audience-template and channel conventions of any project overlay without modifying the plugin defaults
The agent MUST establish a predictable cadence and document the triggers for off-cycle communication

fix-hat 4Feedback AssessorIndependently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.

Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.

Anti-patterns (RFC 2119):

The agent MUST NOT edit any file — you are a verifier, not a fixer
The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden — reject_hat with what's outstanding.
The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean reject_hat