Training · stage 1 of 5

Needs Analysis

Auto gate

Conduct skills gap analysis and define learning objectives

Needs Analysis

The upstream knowledge stage for the entire training lifecycle: establish whether training is even the right intervention, who needs it, and what they must be able to do at the end. If the needs are wrong here, every later stage delivers the wrong program well.

Scope

Diagnosing the gap and defining objectives: performance data, audience profile, competency baseline, the gap between current and target performance, and whether training (vs process, tooling, or hiring) is the right lever. Needs-analysis decides what problem the program must solve and for whom — not how the curriculum is structured (design) or built (develop).

What to do

Quantify the gap between current and target performance with real data, not assumed deficiency.
Confirm training is actually the right lever before recommending a program — sometimes the answer is process or tooling.
Profile the audience well enough that design can make modality and sequencing choices.
Write learning objectives concrete enough that evaluate can later measure against them.

What NOT to do

Don't design curriculum structure, module sequence, or assessment strategy — that's design.
Don't author materials or build content.
Don't recommend training when the evidence points to a non-training cause of the gap.
Don't leave the target outcome so vague that no later stage can check against it.

How the engine runs this stage

1Elaborate

collaborative · plan the work, fan out discovery, declare outputs

Discovery fan-out

knowledge artifactNeeds AssessmentSkills gap analysis, learning objectives, and target audience profile.

Needs Assessment

Skills gap analysis, learning objectives, and target audience profile.

Content Guide

Structure the assessment for curriculum design:

Current state -- existing competency levels across target roles and skills
Target state -- required competency levels with business justification
Gap analysis -- quantified gaps between current and target with priority ranking
Learning objectives -- specific, observable objectives following Bloom's taxonomy
Target audience -- demographics, prior knowledge, learning preferences, and constraints
Intervention validation -- confirmation that training is the appropriate solution

Quality Signals

Gaps are quantified using data (assessments, performance metrics, surveys)
Learning objectives use observable verbs with measurable outcomes
Training is validated as the right intervention, not assumed
Stakeholder input includes learners, managers, and subject matter experts

Phase guidance

phase overrideELABORATION- "Skills gap analysis maps current competency levels against target levels for each role with measurable gaps"

Needs Analysis Stage — Elaboration

Criteria Guidance

Good criteria — concrete and verifiable

"Skills gap analysis maps current competency levels against target levels for each role with measurable gaps"
"Learning objectives follow Bloom's taxonomy with specific, observable verbs and measurable outcomes"
"Needs assessment includes stakeholder input from at least 3 sources: learners, managers, and subject matter experts"

Bad criteria — vague (no clear check)

"Needs are assessed"
"Gaps are identified"
"Objectives are defined"

Outputs produced

output templateNeeds AssessmentSkills gap analysis with quantified gaps, learning objectives, and audience profile.

Needs Assessment

Skills gap analysis with quantified gaps, learning objectives, and audience profile.

Expected Artifacts

Skills gap analysis -- current vs target competency levels for each role with measurable gaps
Learning objectives -- following Bloom's taxonomy with observable verbs and measurable outcomes
Audience profile -- target learners profiled with existing knowledge and learning preferences
Stakeholder input -- validated needs from learners, managers, and subject matter experts

Quality Signals

Gap measurements are validated against multiple data sources
Learning objectives use specific, observable verbs from Bloom's taxonomy
Objectives align with business goals and are achievable within constraints
At least 3 stakeholder sources provide input (learners, managers, SMEs)

2Review

pre-execute · agents audit the planned spec before any code lands

review agentValidityThe agent **MUST** verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.

Mandate: The agent MUST verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.

Check

The agent MUST verify and file feedback for any violation:

Evidence behind every gap — every quantified gap MUST cite a specific source (data point, dated stakeholder input by role, named system telemetry). Gaps backed only by "the team thinks" or "common knowledge" are findings.
Knowledge / skill / will classification — every gap MUST be classified as knowledge, skill, or will / system. Unclassified gaps mean the consultant hat couldn't credibly choose an intervention.
Intervention check — training MUST be explicitly confirmed as the right lever versus process change, tooling change, hiring, or management coaching. A needs assessment that recommends training without this check is a finding.
Audience profile completeness — the audience MUST be profiled with population, role, size, and the constraints that affect modality choice (geographic distribution, accessibility, time-on-job, technology access).
Bloom-aligned learning objectives — every objective MUST use an action verb that names a concrete observable behavior. Understand, know, be aware of, and cover are findings, not objectives.
Gap-to-objective trace — every priority-1 gap MUST have at least one objective covering it; every objective MUST trace back to a specific gap. Orphan objectives and unaddressed gaps are both findings.
Organizational readiness — the assessment MUST address whether managers will reinforce the new behavior, whether learners have the conditions to apply it, and whether the surrounding system supports the change.

Common failure modes to look for

A target performance defined as a topic list (knows about X) rather than observable behavior.
A gap quantified as a single percentage averaged across heterogeneous skills.
A will / system gap with a training recommendation attached anyway.
A modality recommendation that's not justified against the audience profile.
A learning objective that names a topic (databases) instead of a behavior (design a normalized schema for a transactional workload).
A needs assessment that confirms the gap but doesn't address whether the system around the learner will support new behavior.

3Execute

per-unit baton · Analyst → Consultant → Verifier

hat 1AnalystQuantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.

Focus: Quantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.

Process

1. Establish the target

Before measuring anything, name what "good" looks like for the role / audience. Pull from:

Role definition — job description, role expectations, performance standards, named competency framework if one exists for this organization
Strategic context — what the business is trying to accomplish that this role contributes to
Subject-matter input — a senior practitioner's description of what mastery looks like in this role

Capture the target as a set of observable behaviors (can do X under condition Y to standard Z), not as a list of topics ("knows about authentication"). Behaviors are measurable; topics are not.

2. Establish the current state

Use evidence, not assumption. Acceptable sources:

Performance data already collected (assessment scores, completion rates, quality metrics, error rates, support tickets, customer-satisfaction scores tied to the role's outputs)
Direct assessment (skills test, work sample review, observation of practice)
Structured stakeholder input — surveys or interviews with learners, their managers, and named subject-matter experts; cite each source by date and role
Existing system / process telemetry where it credibly reflects role performance

If the only "evidence" available is "the manager thinks the team isn't strong on X", capture it as an opinion not as data, and flag the absence of harder evidence in the report.

3. Quantify the gap

Per behavior in the target, write current-state evidence alongside target-state expectation. Express the gap as concretely as the evidence allows:

Target behavior	Current evidence	Gap
verbatim target	data point + source + date	delta, with units when possible

Don't average gaps across heterogeneous behaviors — a 20% gap in one skill plus a 5% gap in another is not a "12.5% overall gap". Keep behaviors separate.

4. Distinguish knowledge gap from skill gap from will gap

These three failure modes look identical from the outside and respond to entirely different interventions:

Knowledge gap — the learner doesn't know the thing. Training can fix this.
Skill gap — the learner knows the thing but can't reliably perform it. Training plus practice can fix this.
Will / system gap — the learner knows it, can do it, and isn't doing it because of incentive, tooling, process, or culture. Training will NOT fix this; a process / tooling / management change might.

For every quantified gap, flag which type the evidence supports. The consultant hat depends on this classification.

5. Prioritize

Stack-rank the gaps by business impact × learning feasibility. A high-impact gap that the audience can plausibly close with training is the highest priority. A high-impact gap that's actually a process gap goes to the consultant for a non-training recommendation, not to the priority list.

Format guidance

Write the unit body in this structure:

Audience — population, role, size, relevant constraints (geographic, accessibility, technology, time-on-job).
Target performance — observable behaviors at the target standard, with citation.
Current performance — evidence per target behavior, with sources and dates.
Gap quantification — the per-behavior table above.
Gap classification — knowledge / skill / will, per behavior, with reasoning.
Prioritized gap list — ranked by impact × feasibility, with the rationale per ranking.
Open questions — anything the consultant hat must resolve before recommending an intervention.

Anti-patterns (RFC 2119)

The agent MUST NOT quantify gaps based on assumptions, anecdote, or "common knowledge" — name a source for every data point.
The agent MUST NOT treat all gaps as equally important. Rank them and justify the ranking.
The agent MUST distinguish knowledge gaps, skill gaps, and will / system gaps; this classification determines whether training is the right intervention at all.
The agent MUST NOT define target behaviors as topic lists (knows about X) — they MUST be observable performance statements.
The agent MUST NOT collapse heterogeneous gaps into a single percentage.
The agent MUST cite stakeholder input by role and date, not as "the team said".
The agent MUST NOT recommend an intervention — that's the consultant hat's job. Stay in evidence mode.
The agent MUST flag absence of evidence rather than fill the gap with assumption.

hat 2ConsultantInterpret the analyst's quantified gap, confirm whether training is the right intervention, and — if it is — recommend modality, intensity, and the learning objectives that frame the design stage. You are the do role for needs analysis. The analyst hands you evidence; you hand the design stage a recommended intervention with named learning objectives.

Focus: Interpret the analyst's quantified gap, confirm whether training is the right intervention, and — if it is — recommend modality, intensity, and the learning objectives that frame the design stage. You are the do role for needs analysis. The analyst hands you evidence; you hand the design stage a recommended intervention with named learning objectives.

Process

1. Re-read the gap classification

Start by checking the analyst's gap classification (knowledge / skill / will). The single most common failure of training programs is solving a will / system gap with a course. Before recommending training, confirm:

Is the gap a knowledge gap? Training is plausible.
Is the gap a skill gap? Training plus structured practice is plausible.
Is the gap a will / system gap? Recommend a NON-training intervention (process change, tooling, management coaching, incentive change). The training studio is the wrong lifecycle for this — your output is the recommendation to stop, not a curriculum.

If you find evidence the analyst's classification is wrong, file an internal note and reject — don't paper over it by recommending training anyway.

2. Confirm organizational readiness

Even when training is the right lever, it can fail because the organization isn't ready to absorb it:

Will managers reinforce the new behavior post-training, or undermine it?
Do learners have the time / tooling / authority to apply what they learn?
Is the system that produces the gap (escalation paths, tooling, incentives) going to support the new behavior?

If the answer to any of these is "no", note it in the recommendation. Training delivered into a hostile system has near-zero transfer to job, regardless of the program's quality.

Given the audience profile from the analyst, choose the delivery modality. The dimensions to consider, in priority order:

Synchronous vs. asynchronous — does the content need real-time feedback, peer interaction, facilitator adaptation? Or is it self-paced reinforcement?
In-person vs. remote — is hands-on practice with physical equipment / co-located peers required, or does the content travel?
Self-paced vs. cohort — does learning benefit from peer comparison and group accountability, or is variable time-to-mastery more important?
Blended — combination of the above, with named handoff points.

Justify the choice against the audience's working pattern, geographic distribution, accessibility needs, and the nature of the skill being built. "We always use [generic modality category]" is not a justification.

4. Write the learning objectives

Learning objectives are the spec the design stage consumes. Write them to Bloom's taxonomy — the action verb names the cognitive level (identify, apply, analyze, evaluate, create). Each objective is one sentence:

By the end of [program / module], [audience] will be able to [observable behavior at the targeted Bloom level], under [condition], to [standard].

Anti-shape to avoid:

Participants will understand X. Learners will be aware of Y. The course covers Z.

Understand, know, be aware of, and cover are not measurable. Replace them with concrete action verbs aligned to the cognitive level the gap requires.

5. Tie objectives back to gaps

Every objective MUST trace to a specific gap in the analyst's quantified list. Every priority-1 gap from the analyst MUST have at least one objective covering it. Surface any mismatch — extra objectives without a backing gap, or gaps without an objective.

Format guidance

Write the unit body in this structure:

Intervention recommendation — training / not training (named alternative) / training + named adjacent intervention. One sentence, then the reasoning.
Readiness assessment — managerial reinforcement, learner conditions, system support. Note any caveat that could undermine transfer.
Modality recommendation — synchronous / asynchronous / in-person / remote / cohort / self-paced / blended, with justification anchored to the audience profile.
Learning objectives — Bloom-aligned, one per line, traceable to a specific gap.
Gap-to-objective trace — table with gap → objective(s) mapping.
Open questions — escalations the design stage will need answered.

Anti-patterns (RFC 2119)

The agent MUST NOT recommend training as the solution for every performance gap. Will / system gaps get non-training recommendations.
The agent MUST confirm organizational readiness; a brilliant program delivered into a hostile system fails.
The agent MUST write learning objectives using Bloom-aligned action verbs that name a concrete observable behavior at the cognitive level the gap requires.
The agent MUST NOT use understand, know, be aware of, or cover — these are not measurable.
The agent MUST trace every learning objective back to a specific gap from the analyst's evidence.
The agent MUST NOT justify modality by team habit or convenience — justify against audience and skill nature.
The agent MUST NOT design the curriculum here; that's the design stage. Your output is the spec the design stage consumes.
The agent MUST flag missing prerequisites or unanswered open questions explicitly rather than paper over them.

hat 3VerifierValidate the per-unit knowledge artifact for the needs-analysis stage of training. Units here are training need — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).

Focus: Validate the per-unit knowledge artifact for the needs-analysis stage of training. Units here are training need — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).

Anti-patterns (RFC 2119):

The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. workflow engine territory per architecture §1.1.
The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities.
The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
The agent MUST name a specific failed criterion in any rejection.
The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.

Validate this unit's outputs against its criteria

List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.

What you check (BODY ONLY)

1. Artifact answers its topic

The unit's title and first paragraph define the topic. The remaining body MUST deliver substantive content on that topic. Reject placeholders, content-free outlines, or redirects.

2. Sources cited

Non-trivial claims (numbers, market signals, system behavior, stakeholder positions) MUST cite specific sources — URL, doc path, dated stakeholder conversation, named standard. Reject "industry common knowledge" or unsourced numerical claims.

3. Internal consistency

Title, mission, and body must align. Numerical/categorical claims must be consistent across the body. Recommendations must follow from the evidence presented.

4. Decision-register consistency

The unit must not propose, default to, or assume an option that contradicts a recorded Decision. Cite the Decision ID in any rejection.

5. Open questions accounted for

Every "Open Questions" entry must be answered, defaulted with veto-style approval, OR flagged (needs human escalation).

4Approve

post-execute · the same agents re-run against the built work

The agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.

approval agentValidityThe agent **MUST** verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.

Check

The agent MUST verify and file feedback for any violation:

Evidence behind every gap — every quantified gap MUST cite a specific source (data point, dated stakeholder input by role, named system telemetry). Gaps backed only by "the team thinks" or "common knowledge" are findings.
Knowledge / skill / will classification — every gap MUST be classified as knowledge, skill, or will / system. Unclassified gaps mean the consultant hat couldn't credibly choose an intervention.
Intervention check — training MUST be explicitly confirmed as the right lever versus process change, tooling change, hiring, or management coaching. A needs assessment that recommends training without this check is a finding.
Audience profile completeness — the audience MUST be profiled with population, role, size, and the constraints that affect modality choice (geographic distribution, accessibility, time-on-job, technology access).
Bloom-aligned learning objectives — every objective MUST use an action verb that names a concrete observable behavior. Understand, know, be aware of, and cover are findings, not objectives.
Gap-to-objective trace — every priority-1 gap MUST have at least one objective covering it; every objective MUST trace back to a specific gap. Orphan objectives and unaddressed gaps are both findings.
Organizational readiness — the assessment MUST address whether managers will reinforce the new behavior, whether learners have the conditions to apply it, and whether the surrounding system supports the change.

Common failure modes to look for

A target performance defined as a topic list (knows about X) rather than observable behavior.
A gap quantified as a single percentage averaged across heterogeneous skills.
A will / system gap with a training recommendation attached anyway.
A modality recommendation that's not justified against the audience profile.
A learning objective that names a topic (databases) instead of a behavior (design a normalized schema for a transactional workload).
A needs assessment that confirms the gap but doesn't address whether the system around the learner will support new behavior.

5Gate

controls advancement to the next stage

Auto

The harness advances automatically — no human in the loop at this gate.

Fix loop

a separate track · Classifier → Analyst → Feedback Assessor

Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.

fix-hat 1ClassifierYou are the **classifier** hat. You run as the FIRST hat in the stage's

Classifier (feedback triage)

You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.

What you do

Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.
Read the stage's unit list via haiku_unit_list { intent, stage }.
Decide:
- target_unit — which unit this FB counter-signals.
  - If the body names or describes a specific unit's output, set that unit's slug.
  - If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
  - When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
- target_invalidates — which approval roles get cleared on closure. Default rule of thumb:
  - user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
  - adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
  - drift origin → ["user"] (drift always escalates to human).
  - agent origin → [] (informational; no rerun).
Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance.
Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returns severity_already_set and you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.
- blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
- high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
- medium — a genuine issue worth fixing; not delivery-blocking.
- low — a nit, polish, or nice-to-have.
Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix) — distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step.
Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton — it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.

What you do NOT do

You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
You do NOT call haiku_feedback_reject — that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6 — that's an acknowledgement, not a rejection.)
You do NOT spawn subagents. The classification is a single read + single write + advance.

Why this hat exists

Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.

fix-hat 2AnalystQuantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.

Process

1. Establish the target

Before measuring anything, name what "good" looks like for the role / audience. Pull from:

Role definition — job description, role expectations, performance standards, named competency framework if one exists for this organization
Strategic context — what the business is trying to accomplish that this role contributes to
Subject-matter input — a senior practitioner's description of what mastery looks like in this role

Capture the target as a set of observable behaviors (can do X under condition Y to standard Z), not as a list of topics ("knows about authentication"). Behaviors are measurable; topics are not.

2. Establish the current state

Use evidence, not assumption. Acceptable sources:

Performance data already collected (assessment scores, completion rates, quality metrics, error rates, support tickets, customer-satisfaction scores tied to the role's outputs)
Direct assessment (skills test, work sample review, observation of practice)
Structured stakeholder input — surveys or interviews with learners, their managers, and named subject-matter experts; cite each source by date and role
Existing system / process telemetry where it credibly reflects role performance

If the only "evidence" available is "the manager thinks the team isn't strong on X", capture it as an opinion not as data, and flag the absence of harder evidence in the report.

3. Quantify the gap

Per behavior in the target, write current-state evidence alongside target-state expectation. Express the gap as concretely as the evidence allows:

Target behavior	Current evidence	Gap
verbatim target	data point + source + date	delta, with units when possible

Don't average gaps across heterogeneous behaviors — a 20% gap in one skill plus a 5% gap in another is not a "12.5% overall gap". Keep behaviors separate.

4. Distinguish knowledge gap from skill gap from will gap

These three failure modes look identical from the outside and respond to entirely different interventions:

Knowledge gap — the learner doesn't know the thing. Training can fix this.
Skill gap — the learner knows the thing but can't reliably perform it. Training plus practice can fix this.
Will / system gap — the learner knows it, can do it, and isn't doing it because of incentive, tooling, process, or culture. Training will NOT fix this; a process / tooling / management change might.

For every quantified gap, flag which type the evidence supports. The consultant hat depends on this classification.

5. Prioritize

Format guidance

Write the unit body in this structure:

Audience — population, role, size, relevant constraints (geographic, accessibility, technology, time-on-job).
Target performance — observable behaviors at the target standard, with citation.
Current performance — evidence per target behavior, with sources and dates.
Gap quantification — the per-behavior table above.
Gap classification — knowledge / skill / will, per behavior, with reasoning.
Prioritized gap list — ranked by impact × feasibility, with the rationale per ranking.
Open questions — anything the consultant hat must resolve before recommending an intervention.

Anti-patterns (RFC 2119)

The agent MUST NOT quantify gaps based on assumptions, anecdote, or "common knowledge" — name a source for every data point.
The agent MUST NOT treat all gaps as equally important. Rank them and justify the ranking.
The agent MUST distinguish knowledge gaps, skill gaps, and will / system gaps; this classification determines whether training is the right intervention at all.
The agent MUST NOT define target behaviors as topic lists (knows about X) — they MUST be observable performance statements.
The agent MUST NOT collapse heterogeneous gaps into a single percentage.
The agent MUST cite stakeholder input by role and date, not as "the team said".
The agent MUST NOT recommend an intervention — that's the consultant hat's job. Stay in evidence mode.
The agent MUST flag absence of evidence rather than fill the gap with assumption.

fix-hat 3Feedback AssessorIndependently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.

Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.

Anti-patterns (RFC 2119):

The agent MUST NOT edit any file — you are a verifier, not a fixer
The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden — reject_hat with what's outstanding.
The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean reject_hat

Needs Analysis

Scope

What to do

What NOT to do

How the engine runs this stage

1Elaborate

Discovery fan-out

Needs Assessment

Content Guide

Quality Signals

Phase guidance

Needs Analysis Stage — Elaboration

Criteria Guidance

Good criteria — concrete and verifiable

Bad criteria — vague (no clear check)

Outputs produced

Needs Assessment

Expected Artifacts

Quality Signals

2Review

Check

Common failure modes to look for

3Execute

Process

1. Establish the target

2. Establish the current state

3. Quantify the gap

4. Distinguish knowledge gap from skill gap from will gap

5. Prioritize

Format guidance

Anti-patterns (RFC 2119)

Process

1. Re-read the gap classification

2. Confirm organizational readiness

3. Recommend modality

4. Write the learning objectives

5. Tie objectives back to gaps

Format guidance

Anti-patterns (RFC 2119)

Validate this unit's outputs against its criteria

What you check (BODY ONLY)

1. Artifact answers its topic

2. Sources cited

3. Internal consistency

4. Decision-register consistency

5. Open questions accounted for

4Approve

Check

Common failure modes to look for

5Gate

Fix loop

Classifier (feedback triage)

What you do

What you do NOT do

Why this hat exists

Process

1. Establish the target

2. Establish the current state

3. Quantify the gap

4. Distinguish knowledge gap from skill gap from will gap

5. Prioritize

Format guidance

Anti-patterns (RFC 2119)