Health Check
Gate: ask
Monitor account health, identify risks, and create action plans
Read account health across multiple dimensions and convert that read into ranked risks with concrete mitigation plans. This stage is the lifecycle's early-warning system — it tells the studio which accounts are healthy enough to grow and which need intervention before renewal is at stake.
Scope
Scoring health on evidence, separating leading from lagging churn signals, and producing a ranked mitigation plan. Health-check decides where each account stands and what to do about the risks — it does not grow product usage (adoption) or qualify expansion opportunities (expansion).
What to do
- Rate each health dimension against explicit evidence, and show the trend versus the prior period.
- Pull external signals — support volume, sentiment, stakeholder access — alongside usage, not in place of it.
- Separate leading indicators from lagging ones, and rank risks by severity and reversibility.
- Give every mitigation an owner and a measurable success criterion.
What NOT to do
- Don't design or run adoption plays — that's the adoption stage.
- Don't qualify or pursue expansion opportunities — that's expansion.
- Don't rate a dimension without the evidence behind the rating.
- Don't surface a risk without a concrete, owned mitigation.
How the engine runs this stage
1. Elaborate
autonomous · plan the work, fan out discovery, declare outputs
Inputs consumed
Discovery fan-out
Knowledge artifact: Health Report
Document the account's overall health assessment with risk analysis. This output feeds downstream stages (expansion, renewal) with a clear picture of account stability.
Content Guide
Structure the report around health dimensions:
- Health score summary — overall rating (healthy/at-risk/critical) with confidence level
- Dimensional assessment — usage health, engagement health, support health, stakeholder health, contract alignment
- Risk register — severity-ranked churn indicators with leading vs lagging classification
- Root cause analysis — underlying drivers for any unhealthy dimensions
- Mitigation plans — specific actions per risk with owners, success criteria, and priority
- Escalation items — issues requiring immediate leadership attention
- Expansion readiness — whether the account is healthy enough to pursue growth
Quality Signals
- Every health dimension has a rating backed by specific evidence
- Risks distinguish between leading indicators (predictive) and lagging indicators (reactive)
- Mitigation plans have measurable success criteria, not just action items
- Expansion readiness assessment is honest — unhealthy accounts are flagged, not glossed over
Phase guidance
Phase override: ELABORATION
Health Check Stage — Elaboration
Criteria Guidance
Good criteria — concrete and verifiable
- "Health scorecard rates account across at least 5 dimensions (usage, engagement, support sentiment, stakeholder access, contract alignment) with evidence for each rating"
- "Risk assessment identifies specific churn indicators with severity levels and mitigation timelines"
- "Action plan includes owner assignments and measurable success criteria for each remediation item"
Bad criteria — vague (no clear check)
- "Account health is assessed"
- "Risks are identified"
- "Action plan exists"
Outputs produced
Output template: Health Report
Scored account health assessment with risk identification and mitigation plans.
Expected Artifacts
- Health scorecard -- account rated across multiple dimensions (usage, engagement, support sentiment, stakeholder access, contract alignment)
- Risk assessment -- churn indicators identified with severity levels and leading indicators
- Mitigation plans -- concrete actions for each risk with success criteria and owners
- Escalation paths -- documented procedures for critical risks
Quality Signals
- Health assessment covers at least 5 dimensions with evidence for each rating
- Each risk has a concrete mitigation plan with measurable success criteria
- Report includes a clear verdict: healthy, at-risk, or critical
- Escalation paths are documented for critical risks
2. Review
pre-execute · agents audit the planned spec before any code lands
Review agent: Risk Accuracy
Mandate: The agent MUST verify health scores accurately reflect account risk and that risks have actionable mitigation. Mis-rated accounts are the most expensive failure mode in customer success: the at-risk account treated as green becomes the surprise churn next cycle. This lens catches scoring drift before it hides the real picture.
Check
The agent MUST verify, and file feedback for any violation:
- At least five dimensions rated: `HEALTH-REPORT.md` rates the account across at least five dimensions (usage, engagement, support sentiment, stakeholder access, contract alignment). Anything fewer is a finding regardless of how confident the rating looks.
- Every rating has cited evidence: No rating without a source. Vibes-based scoring is the highest-priority drift to catch.
- Trend, not point-in-time: Every dimension shows direction versus the prior period. A falling green and a stable yellow are different problems and require different responses.
- Silent signals rated `unknown`: Missing telemetry, missing interactions, and missing stakeholder contact all rate as `unknown` (treated as yellow minimum), not as green by default.
- Leading vs. lagging indicators separated: The risk section distinguishes leading indicators (predictive) from lagging indicators (already happened). A flat list of "risks" without separation is a finding.
- Severity and reversibility ranked separately: Each risk has both ratings, not a single collapsed score. They drive different responses: high-severity / one-way risks are different from high-severity / easy-to-reverse risks.
- Mitigation has owner, success criterion, and escalation: Medium and high-severity risks all have a mitigation plan with a named owner role (not "the team"), a measurable success criterion with a window, and an escalation path.
- Access gaps surfaced where they block mitigation: Mitigations that require an unavailable stakeholder are flagged with the access gap as a precondition.
- One top risk surfaced: A long list with no named top risk leaves the next stage with no baton. The report must surface the single highest-priority risk explicitly.
Common failure modes to look for
- A health report whose ratings collapse to "the team feels good about this account" — no specific evidence cited per dimension
- A silent account rated green because no negative signals appeared
- A risk list ordered by when each risk was noticed rather than by severity / reversibility
- A mitigation owned by "the team" or "CS" rather than a named role
- A "new" risk that has actually been open in prior reports and never closed — chronic risk masquerading as fresh
- A mitigation proposed against a stakeholder the team has not been able to reach for months, without flagging the access gap
- Recency bias: a single recent positive interaction overriding a quarter of declining signals
3. Execute
per-unit baton · Health Monitor → Risk Analyst → Verifier
Hat 1: Health Monitor
Focus: Plan the health read for this unit — assess account health across multiple dimensions, evidence each rating, and surface the trend versus the prior period. You are the plan role for the health-check stage. Your output is the scorecard half of HEALTH-REPORT.md; the risk analyst follows you with the risk-and-mitigation half.
Process
1. Read your inputs
- The upstream `USAGE-REPORT.md` from the adoption stage, the foundation for any usage-dimension rating
- Any external signals available for this account: support volume and sentiment, stakeholder access and engagement, contract status, executive interactions, escalation history
- Prior `HEALTH-REPORT.md` slices for the same account, to read trend, not point-in-time
- The intent's decision register: prior decisions about scoring methodology, dimension weightings, alerting thresholds
2. Choose dimensions (minimum five)
Rate the account on at least five dimensions. Standard set, in priority order:
- Usage: depth and breadth of product use against the adoption targets
- Engagement: quality of working relationship (CSM cadence kept, training attended, advisory participation)
- Support sentiment: support ticket volume, severity, repeat issues, escalation pattern
- Stakeholder access: which named stakeholders the team has access to, and whether champion / sponsor / economic buyer are all reachable
- Contract alignment: is the customer using what they pay for? Is what they pay for what they need?
Add dimensions specific to the studio's domain when relevant (community participation, advocacy, security posture) — but never drop below five.
3. Rate each dimension with evidence, not vibes
For each dimension, produce a row in a scorecard table. The rating MUST be backed by a specific piece of evidence — a metric reading, a named interaction, a documented event. "Feels good" is not evidence.
| Dimension | Rating | Evidence | Trend vs. prior | Source |
|---|---|---|---|---|
| Usage | green / yellow / red | specific metric reading or workflow signal | up / flat / down | data source / interaction date |
Rating scale: keep it simple (3-tier green / yellow / red, or a 1–5 score). Whichever scale the project overlay establishes wins; the plugin default is the 3-tier color.
4. Read silent accounts carefully
A silent account is not healthy by default: it might be deeply engaged, or it might be quietly leaving. For any dimension where the signal is missing (no usage data, no recent interaction, no stakeholder contact), record the rating as `unknown — <reason>` and treat `unknown` as yellow at minimum. Silence on stakeholder access matters most: if the team has not reached the champion in 90 days, that is itself a signal.
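The silent-account rule can be sketched as a small helper, assuming the default 3-tier scale. The function name `rate_dimension` and the dictionary keys are illustrative, not part of the plugin.

```python
from typing import Optional

# Default 3-tier scale (assumption: no project overlay has replaced it).
KNOWN_RATINGS = {"green", "yellow", "red"}


def rate_dimension(rating: Optional[str], reason: str = "") -> dict:
    """Record a rating; a missing signal becomes 'unknown' with a reason.

    'effective' is the value used for any rollup: unknown counts as
    yellow at minimum and is never read as green.
    """
    if rating is None:
        return {
            "recorded": "unknown",
            "reason": reason or "no signal available",
            "effective": "yellow",
        }
    if rating not in KNOWN_RATINGS:
        raise ValueError(f"unsupported rating: {rating!r}")
    return {"recorded": rating, "reason": "", "effective": rating}
```

Keeping `recorded` and `effective` separate preserves the fact that the signal was missing while still preventing a silent dimension from being read as green.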
5. Read trend, not just point-in-time
For every dimension, show the direction versus the prior period. Two accounts with the same point-in-time rating but opposite trends require different responses — the falling green is more urgent than the stable yellow.
6. Roll up to a holistic read
After every dimension is rated, write a one-paragraph holistic read: which dimensions dominate the picture, where they agree, where they conflict, and whether the overall direction is improving, stable, or deteriorating. The holistic read is not the average of the ratings — it is the analyst's interpretation of which dimensions matter most for this customer right now.
7. Hand off to the risk analyst
Declare what the risk analyst must build on:
- Which dimensions to focus the risk read on (which dimensions are yellow / red and trending the wrong way)
- Any leading indicators you noticed but didn't classify (the analyst will rank them)
- Stakeholder access gaps that block specific risk-mitigation options
8. Self-check before handing off
- At least five dimensions are rated
- Every rating has cited evidence — no rating without a source
- Every dimension shows trend versus the prior period
- Silent / missing signals are rated `unknown` with a reason, not assumed green
- The holistic read is written and identifies which dimensions dominate
- The handoff to the risk analyst names focus dimensions and access gaps
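Part of this self-check is mechanical and could be sketched as below. This is a rough illustration only: the `ScorecardRow` shape mirrors the scorecard table columns, and the function name and finding strings are assumptions.

```python
from dataclasses import dataclass

VALID_TRENDS = {"up", "flat", "down"}


@dataclass
class ScorecardRow:
    dimension: str
    rating: str    # "green" | "yellow" | "red" | "unknown"
    evidence: str  # cited evidence, or the reason when rating is "unknown"
    trend: str     # "up" | "flat" | "down"
    source: str    # data source or interaction date


def scorecard_findings(rows: list[ScorecardRow]) -> list[str]:
    """Return the self-check items the scorecard fails (empty list means pass)."""
    findings = []
    if len({row.dimension for row in rows}) < 5:
        findings.append("fewer than five dimensions rated")
    for row in rows:
        if not row.evidence.strip() or not row.source.strip():
            findings.append(f"{row.dimension}: rating has no cited evidence or source")
        if row.trend not in VALID_TRENDS:
            findings.append(f"{row.dimension}: trend versus prior period is missing")
    return findings
```

The holistic read and the handoff notes still need a human or agent judgment; only the countable items are checked here.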
Anti-patterns (RFC 2119)
- The agent MUST NOT rely on a single metric (NPS, login count) as a proxy for overall health
- The agent MUST NOT rate a dimension without naming the specific evidence behind the rating
- The agent MUST NOT assess health at a single point in time without showing trend
- The agent MUST NOT assume a silent account is healthy: rate it `unknown` and treat as yellow at minimum
- The agent MUST NOT average the dimension ratings into a single number and call that the holistic read
- The agent MUST NOT drop below five dimensions to make the read fit
- The agent MUST NOT invent dimension weightings the project overlay has not declared — use the default scale until overlaid
- The agent MUST capture qualitative signals (stakeholder sentiment, executive engagement) alongside quantitative metrics
- The agent MUST name access gaps that constrain downstream risk mitigation options
Hat 2: Risk Analyst
Focus: Convert the health monitor's scorecard into a ranked, actionable risk read — identify churn indicators (leading and lagging, separated), rank by severity and reversibility, and write a mitigation plan with owners and measurable success criteria. You are the do role for the health-check stage. Your output is the risk-and-mitigation half of HEALTH-REPORT.md.
Process
1. Read your inputs
- The monitor's scorecard half of `HEALTH-REPORT.md` for this unit: every dimension, every rating, every piece of cited evidence
- The handoff: focus dimensions, leading indicators the monitor flagged but did not classify, named stakeholder-access gaps
- Sibling units' risk reads in the same intent, to keep risk taxonomies consistent and avoid double-counting cross-account risks
- Prior `HEALTH-REPORT.md` risk sections for this account, to avoid declaring a "new" risk that has been open for two cycles
2. Separate leading from lagging indicators
Leading indicators predict churn before it is irreversible. Lagging indicators confirm churn has started. Both are real, but they support different responses. Build two lists:
- Leading indicators: declining usage in a previously-active segment, support tickets clustering in a previously-stable area, champion job change, executive sponsor going silent, a budget freeze announced in the customer's industry
- Lagging indicators: an active escalation, a stated intent to evaluate alternatives, a missed renewal date, a contracted-but-unused module, a stalled expansion that was previously qualified
If an indicator could be either, prefer leading — leading-misclassified-as-lagging is the more dangerous error.
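The tie-break rule above can be written down as a small function. The two boolean inputs are a simplifying assumption about how an analyst would describe an indicator.

```python
def classify_indicator(predicts_churn: bool, confirms_churn: bool) -> str:
    """Classify a churn indicator; an ambiguous one defaults to 'leading'."""
    if predicts_churn:
        # Covers the pure-leading case and the "could be either" case.
        return "leading"
    if confirms_churn:
        return "lagging"
    raise ValueError("indicator neither predicts nor confirms churn")
```

Checking the predictive flag first is what encodes the preference: an indicator that fits both lists is treated as leading, so it is acted on while the situation may still be recoverable.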
3. Rank each risk by severity and reversibility
For every identified risk, produce a row in the risk table:
| Risk | Leading / lagging | Severity | Reversibility | Time to act | Source |
|---|---|---|---|---|---|
| named risk | leading / lagging | high / medium / low | easy / hard / one-way | now / this cycle / monitor | which evidence in scorecard / external signal |
Severity is the magnitude of the impact if the risk fires. Reversibility is how recoverable the impact is — a champion leaving is hard to reverse; an integration outage is easy. Time-to-act ranks the queue.
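One way to order the risk table is sketched below. The source only says that time-to-act ranks the queue. Using severity and then reversibility as tie-breakers is an assumption made for this example, as are the `Risk` class and `rank_risks` names.

```python
from dataclasses import dataclass

# Lower number sorts first. Severity and reversibility stay separate fields.
TIME_TO_ACT = {"now": 0, "this cycle": 1, "monitor": 2}
SEVERITY = {"high": 0, "medium": 1, "low": 2}
REVERSIBILITY = {"one-way": 0, "hard": 1, "easy": 2}


@dataclass
class Risk:
    name: str
    kind: str           # "leading" | "lagging"
    severity: str       # "high" | "medium" | "low"
    reversibility: str  # "easy" | "hard" | "one-way"
    time_to_act: str    # "now" | "this cycle" | "monitor"
    source: str         # scorecard evidence or external signal


def rank_risks(risks: list[Risk]) -> list[Risk]:
    """Sort by time to act, then severity, then how hard the impact is to reverse."""
    return sorted(
        risks,
        key=lambda r: (
            TIME_TO_ACT[r.time_to_act],
            SEVERITY[r.severity],
            REVERSIBILITY[r.reversibility],
        ),
    )
```

The sort key is a tuple rather than a combined score, so the two ratings are never collapsed into one number.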
4. Write a mitigation plan per medium- or high-severity risk
For each risk rated medium or high severity, write a mitigation plan with:
- The objective — what stops being true if the mitigation works (the risk is closed, downgraded to low, or made acceptable)
- The action — one concrete sequence the team will run
- The owner — a named role responsible (not "the team")
- The success criterion — a measurable signal that the mitigation worked, with a window
- The escalation path — who is told and when if the mitigation fails
Low-severity risks do not need a full plan — list them in a monitor-and-revisit section.
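A completeness check for these plans might look like the following sketch. The field names and the set of vague owner labels are assumptions for illustration; the source only requires a named role rather than "the team".

```python
from typing import Optional

REQUIRED_FIELDS = ("objective", "action", "owner", "success_criterion", "escalation_path")
# Owner labels that do not name a role (illustrative list).
VAGUE_OWNERS = {"the team", "team", "cs"}


def mitigation_gaps(severity: str, plan: Optional[dict]) -> list[str]:
    """List what is missing from a mitigation plan; low-severity risks need none."""
    if severity == "low":
        return []
    if plan is None:
        return ["medium/high risk has no mitigation plan"]
    gaps = [f"missing {name}" for name in REQUIRED_FIELDS if not str(plan.get(name, "")).strip()]
    if str(plan.get("owner", "")).strip().lower() in VAGUE_OWNERS:
        gaps.append("owner must be a named role")
    return gaps
```

This only checks that each field is present. Whether the success criterion is actually measurable and has a window remains a review judgment.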
5. Surface the one risk that matters most
After ranking, name the single highest-priority risk explicitly: "The risk this account most needs the team to act on this cycle is X, because Y." A long list with no surfaced top item is how triage stays paralyzed. The named top risk is the baton into the next stage's input.
6. Tie back to access gaps
The monitor flagged stakeholder-access gaps that constrain mitigation options. For every high-severity risk whose mitigation requires an unavailable stakeholder, add a Blocked by access gap row and name the access work that has to come first. Don't propose a mitigation that requires the champion if the champion has been silent for 90 days.
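The access-gap check can be sketched as a lookup against last-contact data. The 90-day default follows the example in the text. The function name and the shape of `days_since_contact` are assumptions.

```python
from typing import Dict, Iterable, List, Optional


def blocked_by_access_gap(
    required_stakeholders: Iterable[str],
    days_since_contact: Dict[str, Optional[int]],
    threshold_days: int = 90,
) -> List[str]:
    """Return stakeholders a mitigation needs but the team has not reached in the window.

    A stakeholder with no recorded contact at all is treated as unreachable.
    """
    blocked = []
    for stakeholder in required_stakeholders:
        days = days_since_contact.get(stakeholder)
        if days is None or days >= threshold_days:
            blocked.append(stakeholder)
    return sorted(blocked)
```

A non-empty result means the mitigation should carry a Blocked by access gap row naming the access work that comes first.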
7. Self-check before handing off
- Every dimension the monitor rated yellow / red has at least one risk row
- Indicators are separated into leading and lagging
- Every risk is rated for both severity and reversibility (separately, not collapsed)
- Every medium / high risk has a mitigation with objective, action, named owner, success criterion, and escalation path
- The single highest-priority risk is named explicitly
- Access-gap-blocked mitigations are flagged, not silently proposed
- No risk is declared "new" if it has been open in a prior cycle without resolution
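The first item in this list is a coverage check and can be sketched directly. Treating `unknown` like yellow follows the monitor's rule. The function name and input shapes are illustrative assumptions.

```python
from typing import Dict, Iterable, List

# Ratings that require at least one risk row; "unknown" is treated as yellow at minimum.
NEEDS_RISK_ROW = {"yellow", "red", "unknown"}


def uncovered_dimensions(ratings: Dict[str, str], risk_dimensions: Iterable[str]) -> List[str]:
    """Return dimensions rated yellow/red/unknown that no risk row traces back to.

    ratings: dimension name -> rating from the monitor's scorecard.
    risk_dimensions: the dimension each risk row cites as its source.
    """
    flagged = {dim for dim, rating in ratings.items() if rating in NEEDS_RISK_ROW}
    return sorted(flagged - set(risk_dimensions))
```

An empty result satisfies only that first item; the remaining checks still have to be walked one by one.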
Anti-patterns (RFC 2119)
- The agent MUST NOT identify risks only after the customer has escalated — leading indicators are the bar
- The agent MUST NOT list risks without severity and reversibility ranking
- The agent MUST NOT treat all risks as equally urgent — that is how triage breaks
- The agent MUST NOT collapse severity and reversibility into a single score — they drive different responses
- The agent MUST NOT propose mitigations without owners, success criteria, or escalation paths
- The agent MUST NOT propose a mitigation that requires a stakeholder the team can't reach without flagging the access gap first
- The agent MUST NOT re-declare a risk as "new" if a prior cycle's report already named it
- The agent MUST NOT end the read without surfacing the single highest-priority risk
- The agent MUST distinguish leading from lagging indicators — they drive different responses
- The agent MUST name an owner role (CSM, executive sponsor, product), not "the team"
Hat 3: Verifier
Focus: Validate the per-unit knowledge artifact for the health-check stage of customer-success. Units here are health-signal reports: knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. They do NOT cover executable verify-commands or DAG validity (workflow engine/build-stage concerns).
Anti-patterns (RFC 2119):
- The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. That is workflow engine territory per architecture §1.1.
- The agent MUST NOT validate against frontmatter schema, `depends_on:` resolution, status-field shape, or any other FM-driven check; those are workflow engine responsibilities.
- The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
- The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
- The agent MUST name a specific failed criterion in any rejection.
- The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.
Validate this unit's outputs against its criteria
List this unit's declared outputs with `haiku_unit_get { intent, stage, unit, field: "outputs" }`, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit, since sibling units have their own verify passes.
What you check (BODY ONLY)
1. Artifact answers its topic
The unit's title and first paragraph define the topic. The remaining body MUST deliver substantive content on that topic. Reject placeholders, content-free outlines, or redirects.
2. Sources cited
Non-trivial claims (numbers, market signals, system behavior, stakeholder positions) MUST cite specific sources — URL, doc path, dated stakeholder conversation, named standard. Reject "industry common knowledge" or unsourced numerical claims.
3. Internal consistency
Title, mission, and body must align. Numerical/categorical claims must be consistent across the body. Recommendations must follow from the evidence presented.
4. Decision-register consistency
The unit must not propose, default to, or assume an option that contradicts a recorded Decision. Cite the Decision ID in any rejection.
5. Open questions accounted for
Every "Open Questions" entry must be answered, defaulted with veto-style approval, OR flagged (needs human escalation).
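The placeholder and empty-section rules are the most mechanical of the body checks. The sketch below assumes the artifact body is markdown with `#`-style headings, which the source does not state; the function name and gap messages are illustrative.

```python
import re


def body_gaps(body: str) -> list[str]:
    """Flag TODO markers and empty sections in a markdown body (frontmatter is not read)."""
    gaps = []
    if re.search(r"\bTODO\b", body):
        gaps.append("body contains TODO markers")
    # With a capture group, re.split returns
    # [preamble, title1, content1, title2, content2, ...].
    parts = re.split(r"^#{1,6}\s+(.*)$", body, flags=re.MULTILINE)
    for title, content in zip(parts[1::2], parts[2::2]):
        # Limitation: a parent heading followed directly by a subheading counts as empty.
        if not content.strip():
            gaps.append(f"empty section: {title.strip()}")
    return gaps
```

Citation, consistency, and decision-register checks need to read the content itself and are not covered by this sketch.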
4. Approve
post-execute · the same agents re-run against the built work
The agents below fire a second time here, now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.
Approval agent: Risk Accuracy
Mandate: The agent MUST verify health scores accurately reflect account risk and that risks have actionable mitigation. Mis-rated accounts are the most expensive failure mode in customer success: the at-risk account treated as green becomes the surprise churn next cycle. This lens catches scoring drift before it hides the real picture.
Check
The agent MUST verify, and file feedback for any violation:
- At least five dimensions rated: `HEALTH-REPORT.md` rates the account across at least five dimensions (usage, engagement, support sentiment, stakeholder access, contract alignment). Anything fewer is a finding regardless of how confident the rating looks.
- Every rating has cited evidence: No rating without a source. Vibes-based scoring is the highest-priority drift to catch.
- Trend, not point-in-time: Every dimension shows direction versus the prior period. A falling green and a stable yellow are different problems and require different responses.
- Silent signals rated `unknown`: Missing telemetry, missing interactions, and missing stakeholder contact all rate as `unknown` (treated as yellow minimum), not as green by default.
- Leading vs. lagging indicators separated: The risk section distinguishes leading indicators (predictive) from lagging indicators (already happened). A flat list of "risks" without separation is a finding.
- Severity and reversibility ranked separately: Each risk has both ratings, not a single collapsed score. They drive different responses: high-severity / one-way risks are different from high-severity / easy-to-reverse risks.
- Mitigation has owner, success criterion, and escalation: Medium and high-severity risks all have a mitigation plan with a named owner role (not "the team"), a measurable success criterion with a window, and an escalation path.
- Access gaps surfaced where they block mitigation: Mitigations that require an unavailable stakeholder are flagged with the access gap as a precondition.
- One top risk surfaced: A long list with no named top risk leaves the next stage with no baton. The report must surface the single highest-priority risk explicitly.
Common failure modes to look for
- A health report whose ratings collapse to "the team feels good about this account" — no specific evidence cited per dimension
- A silent account rated green because no negative signals appeared
- A risk list ordered by when each risk was noticed rather than by severity / reversibility
- A mitigation owned by "the team" or "CS" rather than a named role
- A "new" risk that has actually been open in prior reports and never closed — chronic risk masquerading as fresh
- A mitigation proposed against a stakeholder the team has not been able to reach for months, without flagging the access gap
- Recency bias: a single recent positive interaction overriding a quarter of declining signals
5. Gate
controls advancement to the next stage
A local review UI opens; a human approves or requests changes via the review tool.
Fix loop
a separate track · Classifier → Health Monitor → Feedback Assessor
Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain (one hat at a time, per finding), then returns to the gate. It runs only when there's a finding to fix.
Fix-hat 1: Classifier
Classifier (feedback triage)
You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.
What you do
1. Read the FB body via `haiku_feedback_read { intent, stage, feedback_id }`.
2. Read the stage's unit list via `haiku_unit_list { intent, stage }`.
3. Decide:
   - `target_unit`: which unit this FB counter-signals.
     - If the body names or describes a specific unit's output, set that unit's slug.
     - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set `null` (intent-scope).
     - When in doubt: `null`. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
   - `target_invalidates`: which approval roles get cleared on closure. Default rule of thumb:
     - `user-chat` / `user-visual` / `user-question` origins → `["user"]` (the human will re-review).
     - `adversarial-review` / `studio-review` origins → `[<filer-agent-name>]` (the originating reviewer re-runs).
     - `drift` origin → `["user"]` (drift always escalates to human).
     - `agent` origin → `[]` (informational; no rerun).
4. Call `haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }`. This writes the `target_unit` / `target_invalidates` routing only; it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets. That's expected on a re-tick; you simply advance.
5. Decide severity and call `haiku_feedback_set_severity { intent, stage, feedback_id, severity }`. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation: the tool returns `severity_already_set` and you simply advance. Only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
   - blocker: the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
   - high: a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
   - medium: a genuine issue worth fixing; not delivery-blocking.
   - low: a nit, polish, or nice-to-have.

   Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't: a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it; it would only `reject_hat`, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: `haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }`. This closes the FB as `non_actionable` (acknowledged, valid, no code fix), distinct from `haiku_feedback_reject` (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done; skip the next step.
7. Otherwise, call `haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" }` to hand off to the next fix-hat. The `message` is the handoff baton: it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop has started (`haiku_feedback_write` is refused). Your reasoning lives in the handoff `message`.
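The default routing rules and the severity-first dispatch order can be summarised in a short sketch. This is not the engine's implementation: the function names, the `filer_agent` argument, and the dict shape of a finding are assumptions made for illustration.

```python
from typing import Dict, List

# Rubric order: lower number is dispatched first.
SEVERITY_RANK = {"blocker": 0, "high": 1, "medium": 2, "low": 3}


def default_invalidates(origin: str, filer_agent: str = "") -> List[str]:
    """Rule-of-thumb approval roles cleared on closure, keyed by feedback origin."""
    if origin in {"user-chat", "user-visual", "user-question", "drift"}:
        return ["user"]  # the human re-reviews; drift always escalates to human
    if origin in {"adversarial-review", "studio-review"}:
        return [filer_agent]  # the originating reviewer re-runs
    if origin == "agent":
        return []  # informational; no rerun
    raise ValueError(f"unknown origin: {origin!r}")


def dispatch_order(findings: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Order findings so higher-severity ones are fixed first (stable within a tier)."""
    return sorted(findings, key=lambda finding: SEVERITY_RANK[finding["severity"]])
```

Because the sort is stable, findings of equal severity keep the order in which they were filed.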
What you do NOT do
- You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
- You do NOT call `haiku_feedback_reject`: that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is the `resolution: "non_actionable"` shortcut in step 6; that's an acknowledgement, not a rejection.)
- You do NOT spawn subagents. The classification is a single read + single write + advance.
Why this hat exists
Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.
Fix-hat 2: Health Monitor
Focus: Plan the health read for this unit — assess account health across multiple dimensions, evidence each rating, and surface the trend versus the prior period. You are the plan role for the health-check stage. Your output is the scorecard half of HEALTH-REPORT.md; the risk analyst follows you with the risk-and-mitigation half.
Process
1. Read your inputs
- The upstream `USAGE-REPORT.md` from the adoption stage, the foundation for any usage-dimension rating
- Any external signals available for this account: support volume and sentiment, stakeholder access and engagement, contract status, executive interactions, escalation history
- Prior `HEALTH-REPORT.md` slices for the same account, to read trend, not point-in-time
- The intent's decision register: prior decisions about scoring methodology, dimension weightings, alerting thresholds
2. Choose dimensions (minimum five)
Rate the account on at least five dimensions. Standard set, in priority order:
- Usage: depth and breadth of product use against the adoption targets
- Engagement: quality of working relationship (CSM cadence kept, training attended, advisory participation)
- Support sentiment: support ticket volume, severity, repeat issues, escalation pattern
- Stakeholder access: which named stakeholders the team has access to, and whether champion / sponsor / economic buyer are all reachable
- Contract alignment: is the customer using what they pay for? Is what they pay for what they need?
Add dimensions specific to the studio's domain when relevant (community participation, advocacy, security posture) — but never drop below five.
3. Rate each dimension with evidence, not vibes
For each dimension, produce a row in a scorecard table. The rating MUST be backed by a specific piece of evidence — a metric reading, a named interaction, a documented event. "Feels good" is not evidence.
| Dimension | Rating | Evidence | Trend vs. prior | Source |
|---|---|---|---|---|
| Usage | green / yellow / red | specific metric reading or workflow signal | up / flat / down | data source / interaction date |
Rating scale: keep it simple (3-tier green / yellow / red, or a 1–5 score). Whichever scale the project overlay establishes wins; the plugin default is the 3-tier color.
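As a minimal sketch of the scorecard row described above, assuming the plugin-default 3-tier scale: the class and function names here are hypothetical, and only the column layout comes from the table itself. The renderer refuses any row that carries a rating without evidence or a source.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; only the five columns come from the table.
HEADER = "| Dimension | Rating | Evidence | Trend vs. prior | Source |"
DIVIDER = "|---|---|---|---|---|"


@dataclass(frozen=True)
class DimensionRating:
    dimension: str
    rating: str    # "green", "yellow" or "red" (assumed 3-tier default)
    evidence: str  # a metric reading, named interaction, or documented event
    trend: str     # "up", "flat" or "down"
    source: str    # data source or interaction date


def render_scorecard(rows):
    """Render rows as the markdown scorecard; reject evidence-free ratings."""
    lines = [HEADER, DIVIDER]
    for row in rows:
        if not row.evidence.strip() or not row.source.strip():
            raise ValueError(f"{row.dimension}: rating has no evidence or source")
        lines.append(
            f"| {row.dimension} | {row.rating} | {row.evidence} "
            f"| {row.trend} | {row.source} |"
        )
    return "\n".join(lines)
```

Raising on an empty evidence cell, rather than rendering a blank, keeps an unsupported rating from reaching the report unnoticed.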
4. Read silent accounts carefully
A silent account is not healthy by default — it might be deeply engaged, or it might be quietly leaving. For any dimension where the signal is missing (no usage data, no recent interaction, no stakeholder contact), record the rating as unknown — <reason> and treat unknown as a yellow at minimum. Silent on stakeholder access especially: if the team has not reached the champion in 90 days, that is a signal.
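The silent-account rule can be sketched as two small helpers. This is an illustration under stated assumptions: the function names are hypothetical, "yellow at minimum" is modelled as a floor that an analyst may still escalate, and "not reached in 90 days" is read as 90 or more days since the last contact.

```python
from datetime import date

SEVERITY = {"green": 0, "yellow": 1, "red": 2}


def effective_rating(rating, reason=None):
    """Return the colour to plan against.

    An 'unknown' rating must carry a reason and counts as yellow at minimum;
    this sketch returns the floor and leaves escalation to the analyst.
    """
    if rating == "unknown":
        if not reason or not reason.strip():
            raise ValueError("an unknown rating must state why the signal is missing")
        return "yellow"
    if rating not in SEVERITY:
        raise ValueError(f"unrecognised rating: {rating!r}")
    return rating


def champion_out_of_reach(last_contact, today, threshold_days=90):
    """True when the champion was never reached or not within the threshold."""
    if last_contact is None:
        return True
    return (today - last_contact).days >= threshold_days
```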
5. Read trend, not just point-in-time
For every dimension, show the direction versus the prior period. Two accounts with the same point-in-time rating but opposite trends require different responses — the falling green is more urgent than the stable yellow.
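One way to make "the falling green is more urgent than the stable yellow" concrete is to sort on a projected rating. The ordering below is an assumption made for illustration, not a scoring method the stage defines: a downward trend pushes the rating one tier toward red, and ties are broken by trend and then by current severity.

```python
SEVERITY = {"green": 0, "yellow": 1, "red": 2}
TREND_PRESSURE = {"up": -1, "flat": 0, "down": 1}


def urgency_key(rating, trend):
    """Hypothetical sort key: projected severity, then trend, then severity."""
    severity = SEVERITY[rating]
    # A falling dimension is projected one tier worse, capped at red.
    projected = min(SEVERITY["red"], severity + (1 if trend == "down" else 0))
    return (projected, TREND_PRESSURE[trend], severity)


def rank_by_urgency(rows):
    """rows: (dimension, rating, trend) tuples, most urgent first."""
    return sorted(rows, key=lambda row: urgency_key(row[1], row[2]), reverse=True)
```

With this key, a green dimension trending down sorts ahead of a flat yellow one, and both sort ahead of a flat green. The holistic read remains the analyst's interpretation; a sort like this only orders what to look at first.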
6. Roll up to a holistic score
After every dimension is rated, write a one-paragraph holistic read: which dimensions dominate the picture, where they agree, where they conflict, and whether the overall direction is improving, stable, or deteriorating. The holistic read is not the average of the ratings — it is the analyst's interpretation of which dimensions matter most for this customer right now.
7. Hand off to the risk analyst
Declare what the risk analyst must build on:
- Which dimensions to focus the risk read on (which dimensions are yellow / red and trending the wrong way)
- Any leading indicators you noticed but didn't classify (the analyst will rank them)
- Stakeholder access gaps that block specific risk-mitigation options
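The handoff above can be sketched as a small structure. The dictionary keys and function names are hypothetical, and "trending the wrong way" is read here as a `down` trend.

```python
def focus_dimensions(rows):
    """rows: (dimension, rating, trend) tuples.

    Return the dimensions that are yellow or red and trending down.
    """
    return [
        dimension
        for dimension, rating, trend in rows
        if rating in ("yellow", "red") and trend == "down"
    ]


def build_handoff(rows, unclassified_indicators, access_gaps):
    """Bundle the three things the risk analyst builds on (assumed shape)."""
    return {
        "focus_dimensions": focus_dimensions(rows),
        "unclassified_leading_indicators": list(unclassified_indicators),
        "access_gaps": list(access_gaps),
    }
```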
8. Self-check before handing off
- At least five dimensions are rated
- Every rating has cited evidence — no rating without a source
- Every dimension shows trend versus the prior period
- Silent / missing signals are rated `unknown` with a reason, not assumed green
- The holistic read is written and identifies which dimensions dominate
- The handoff to the risk analyst names focus dimensions and access gaps
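The checklist above lends itself to a mechanical pre-flight pass. The sketch below assumes each scorecard row is a dictionary with hypothetical keys (`dimension`, `rating`, `evidence`, `source`, `trend`, `reason`) and that the handoff is a dictionary naming focus dimensions and access gaps; none of these shapes are defined by the stage itself. It also assumes an `unknown` rating may carry an `unknown` trend, with the reason standing in for evidence.

```python
def self_check(rows, holistic_read, handoff):
    """Return a list of failed checklist items; an empty list means ready."""
    failures = []
    if len(rows) < 5:
        failures.append("fewer than five dimensions rated")
    for row in rows:
        name = row.get("dimension", "?")
        if row.get("rating") == "unknown":
            # A missing signal is allowed, but only with a stated reason.
            if not row.get("reason"):
                failures.append(f"{name}: unknown rating without a reason")
            allowed_trends = ("up", "flat", "down", "unknown")
        else:
            if not row.get("evidence") or not row.get("source"):
                failures.append(f"{name}: rating without cited evidence")
            allowed_trends = ("up", "flat", "down")
        if row.get("trend") not in allowed_trends:
            failures.append(f"{name}: no trend versus the prior period")
    if not holistic_read or not holistic_read.strip():
        failures.append("holistic read is missing")
    for key in ("focus_dimensions", "access_gaps"):
        if key not in handoff:
            failures.append(f"handoff does not name {key}")
    return failures
```

A check like this cannot judge whether the holistic read actually identifies the dominant dimensions; it only catches the structural omissions.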
Anti-patterns (RFC 2119)
- The agent MUST NOT rely on a single metric (NPS, login count) as a proxy for overall health
- The agent MUST NOT rate a dimension without naming the specific evidence behind the rating
- The agent MUST NOT assess health at a single point in time without showing trend
- The agent MUST NOT assume a silent account is healthy; rate it `unknown` and treat it as yellow at minimum
- The agent MUST NOT average the dimension ratings into a single number and call that the holistic read
- The agent MUST NOT drop below five dimensions to make the read fit
- The agent MUST NOT invent dimension weightings the project overlay has not declared — use the default scale until overlaid
- The agent MUST capture qualitative signals (stakeholder sentiment, executive engagement) alongside quantitative metrics
- The agent MUST name access gaps that constrain downstream risk mitigation options
fix-hat 3: Feedback Assessor
Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.
Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
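The two closure rules above reduce to one decision: advance only when nothing is outstanding. The sketch below is illustrative. The function and parameter names are hypothetical, the returned labels merely stand in for the advance and `reject_hat` calls (whose real signatures are not shown here), and `is_still_defective` represents whatever independent on-disk check the assessor performs.

```python
def closure_decision(enumerated_items, is_still_defective, open_blockers=()):
    """Decide whether a finding may be closed.

    enumerated_items: every item the finding lists, not just those the fix touched.
    is_still_defective: callable that checks one item and returns True if it
        is still broken.
    open_blockers: unresolved blockers the handoff message would have to mention.
    """
    # Walk the finding's full set so untouched items are verified too.
    survivors = [item for item in enumerated_items if is_still_defective(item)]
    outstanding = list(open_blockers) + [f"still defective: {item}" for item in survivors]
    if outstanding:
        return ("reject", outstanding)
    return ("advance", [])
```

The loop iterates over the finding's enumeration, not the fixer's change list, which is the distinction the FB-118 report turns on.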
Anti-patterns (RFC 2119):
- The agent MUST NOT edit any file — you are a verifier, not a fixer
- The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
- The agent MUST NOT call `advance_hat` (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden; `reject_hat` with what's outstanding.
- The agent MUST NOT reject a finding because "it's not worth fixing": that is the human's decision, not yours. Either close when resolved, leave open when not, or reject when genuinely invalid
- The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
- The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched. Spot-check the untouched items on disk first; survivors mean `reject_hat`