Needs Analysis
Auto gateConduct skills gap analysis and define learning objectives
Needs Analysis
The upstream knowledge stage for the entire training lifecycle: establish whether training is even the right intervention, who needs it, and what they must be able to do at the end. If the needs are wrong here, every later stage delivers the wrong program well.
Scope
Diagnosing the gap and defining objectives: performance data, audience profile, competency baseline, the gap between current and target performance, and whether training (vs process, tooling, or hiring) is the right lever. Needs-analysis decides what problem the program must solve and for whom — not how the curriculum is structured (design) or built (develop).
What to do
- Quantify the gap between current and target performance with real data, not assumed deficiency.
- Confirm training is actually the right lever before recommending a program — sometimes the answer is process or tooling.
- Profile the audience well enough that design can make modality and sequencing choices.
- Write learning objectives concrete enough that evaluate can later measure against them.
What NOT to do
- Don't design curriculum structure, module sequence, or assessment strategy — that's design.
- Don't author materials or build content.
- Don't recommend training when the evidence points to a non-training cause of the gap.
- Don't leave the target outcome so vague that no later stage can check against it.
How the engine runs this stage
1Elaborate
collaborative · plan the work, fan out discovery, declare outputsDiscovery fan-out
knowledge artifactNeeds AssessmentSkills gap analysis, learning objectives, and target audience profile.
Needs Assessment
Skills gap analysis, learning objectives, and target audience profile.
Content Guide
Structure the assessment for curriculum design:
- Current state -- existing competency levels across target roles and skills
- Target state -- required competency levels with business justification
- Gap analysis -- quantified gaps between current and target with priority ranking
- Learning objectives -- specific, observable objectives following Bloom's taxonomy
- Target audience -- demographics, prior knowledge, learning preferences, and constraints
- Intervention validation -- confirmation that training is the appropriate solution
Quality Signals
- Gaps are quantified using data (assessments, performance metrics, surveys)
- Learning objectives use observable verbs with measurable outcomes
- Training is validated as the right intervention, not assumed
- Stakeholder input includes learners, managers, and subject matter experts
Phase guidance
phase overrideELABORATION- "Skills gap analysis maps current competency levels against target levels for each role with measurable gaps"
Needs Analysis Stage — Elaboration
Criteria Guidance
Good criteria — concrete and verifiable
- "Skills gap analysis maps current competency levels against target levels for each role with measurable gaps"
- "Learning objectives follow Bloom's taxonomy with specific, observable verbs and measurable outcomes"
- "Needs assessment includes stakeholder input from at least 3 sources: learners, managers, and subject matter experts"
Bad criteria — vague (no clear check)
- "Needs are assessed"
- "Gaps are identified"
- "Objectives are defined"
Outputs produced
output templateNeeds AssessmentSkills gap analysis with quantified gaps, learning objectives, and audience profile.
Needs Assessment
Skills gap analysis with quantified gaps, learning objectives, and audience profile.
Expected Artifacts
- Skills gap analysis -- current vs target competency levels for each role with measurable gaps
- Learning objectives -- following Bloom's taxonomy with observable verbs and measurable outcomes
- Audience profile -- target learners profiled with existing knowledge and learning preferences
- Stakeholder input -- validated needs from learners, managers, and subject matter experts
Quality Signals
- Gap measurements are validated against multiple data sources
- Learning objectives use specific, observable verbs from Bloom's taxonomy
- Objectives align with business goals and are achievable within constraints
- At least 3 stakeholder sources provide input (learners, managers, SMEs)
2Review
pre-execute · agents audit the planned spec before any code landsreview agentValidityThe agent **MUST** verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.
Mandate: The agent MUST verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.
Check
The agent MUST verify and file feedback for any violation:
- Evidence behind every gap — every quantified gap MUST cite a specific source (data point, dated stakeholder input by role, named system telemetry). Gaps backed only by "the team thinks" or "common knowledge" are findings.
- Knowledge / skill / will classification — every gap MUST be classified as knowledge, skill, or will / system. Unclassified gaps mean the consultant hat couldn't credibly choose an intervention.
- Intervention check — training MUST be explicitly confirmed as the right lever versus process change, tooling change, hiring, or management coaching. A needs assessment that recommends training without this check is a finding.
- Audience profile completeness — the audience MUST be profiled with population, role, size, and the constraints that affect modality choice (geographic distribution, accessibility, time-on-job, technology access).
- Bloom-aligned learning objectives — every objective MUST use an action verb that names a concrete observable behavior.
Understand,know,be aware of, andcoverare findings, not objectives. - Gap-to-objective trace — every priority-1 gap MUST have at least one objective covering it; every objective MUST trace back to a specific gap. Orphan objectives and unaddressed gaps are both findings.
- Organizational readiness — the assessment MUST address whether managers will reinforce the new behavior, whether learners have the conditions to apply it, and whether the surrounding system supports the change.
Common failure modes to look for
- A target performance defined as a topic list (
knows about X) rather than observable behavior. - A gap quantified as a single percentage averaged across heterogeneous skills.
- A will / system gap with a training recommendation attached anyway.
- A modality recommendation that's not justified against the audience profile.
- A learning objective that names a topic (
databases) instead of a behavior (design a normalized schema for a transactional workload). - A needs assessment that confirms the gap but doesn't address whether the system around the learner will support new behavior.
3Execute
per-unit baton · Analyst → Consultant → Verifierhat 1AnalystQuantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.
Focus: Quantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.
Process
1. Establish the target
Before measuring anything, name what "good" looks like for the role / audience. Pull from:
- Role definition — job description, role expectations, performance standards, named competency framework if one exists for this organization
- Strategic context — what the business is trying to accomplish that this role contributes to
- Subject-matter input — a senior practitioner's description of what mastery looks like in this role
Capture the target as a set of observable behaviors (can do X under condition Y to standard Z), not as a list of topics ("knows about authentication"). Behaviors are measurable; topics are not.
2. Establish the current state
Use evidence, not assumption. Acceptable sources:
- Performance data already collected (assessment scores, completion rates, quality metrics, error rates, support tickets, customer-satisfaction scores tied to the role's outputs)
- Direct assessment (skills test, work sample review, observation of practice)
- Structured stakeholder input — surveys or interviews with learners, their managers, and named subject-matter experts; cite each source by date and role
- Existing system / process telemetry where it credibly reflects role performance
If the only "evidence" available is "the manager thinks the team isn't strong on X", capture it as an opinion not as data, and flag the absence of harder evidence in the report.
3. Quantify the gap
Per behavior in the target, write current-state evidence alongside target-state expectation. Express the gap as concretely as the evidence allows:
| Target behavior | Current evidence | Gap |
|---|---|---|
| verbatim target | data point + source + date | delta, with units when possible |
Don't average gaps across heterogeneous behaviors — a 20% gap in one skill plus a 5% gap in another is not a "12.5% overall gap". Keep behaviors separate.
4. Distinguish knowledge gap from skill gap from will gap
These three failure modes look identical from the outside and respond to entirely different interventions:
- Knowledge gap — the learner doesn't know the thing. Training can fix this.
- Skill gap — the learner knows the thing but can't reliably perform it. Training plus practice can fix this.
- Will / system gap — the learner knows it, can do it, and isn't doing it because of incentive, tooling, process, or culture. Training will NOT fix this; a process / tooling / management change might.
For every quantified gap, flag which type the evidence supports. The consultant hat depends on this classification.
5. Prioritize
Stack-rank the gaps by business impact × learning feasibility. A high-impact gap that the audience can plausibly close with training is the highest priority. A high-impact gap that's actually a process gap goes to the consultant for a non-training recommendation, not to the priority list.
Format guidance
Write the unit body in this structure:
- Audience — population, role, size, relevant constraints (geographic, accessibility, technology, time-on-job).
- Target performance — observable behaviors at the target standard, with citation.
- Current performance — evidence per target behavior, with sources and dates.
- Gap quantification — the per-behavior table above.
- Gap classification — knowledge / skill / will, per behavior, with reasoning.
- Prioritized gap list — ranked by impact × feasibility, with the rationale per ranking.
- Open questions — anything the consultant hat must resolve before recommending an intervention.
Anti-patterns (RFC 2119)
- The agent MUST NOT quantify gaps based on assumptions, anecdote, or "common knowledge" — name a source for every data point.
- The agent MUST NOT treat all gaps as equally important. Rank them and justify the ranking.
- The agent MUST distinguish knowledge gaps, skill gaps, and will / system gaps; this classification determines whether training is the right intervention at all.
- The agent MUST NOT define target behaviors as topic lists (
knows about X) — they MUST be observable performance statements. - The agent MUST NOT collapse heterogeneous gaps into a single percentage.
- The agent MUST cite stakeholder input by role and date, not as "the team said".
- The agent MUST NOT recommend an intervention — that's the consultant hat's job. Stay in evidence mode.
- The agent MUST flag absence of evidence rather than fill the gap with assumption.
hat 2ConsultantInterpret the analyst's quantified gap, confirm whether training is the right intervention, and — if it is — recommend modality, intensity, and the learning objectives that frame the design stage. You are the do role for needs analysis. The analyst hands you evidence; you hand the design stage a recommended intervention with named learning objectives.
Focus: Interpret the analyst's quantified gap, confirm whether training is the right intervention, and — if it is — recommend modality, intensity, and the learning objectives that frame the design stage. You are the do role for needs analysis. The analyst hands you evidence; you hand the design stage a recommended intervention with named learning objectives.
Process
1. Re-read the gap classification
Start by checking the analyst's gap classification (knowledge / skill / will). The single most common failure of training programs is solving a will / system gap with a course. Before recommending training, confirm:
- Is the gap a knowledge gap? Training is plausible.
- Is the gap a skill gap? Training plus structured practice is plausible.
- Is the gap a will / system gap? Recommend a NON-training intervention (process change, tooling, management coaching, incentive change). The training studio is the wrong lifecycle for this — your output is the recommendation to stop, not a curriculum.
If you find evidence the analyst's classification is wrong, file an internal note and reject — don't paper over it by recommending training anyway.
2. Confirm organizational readiness
Even when training is the right lever, it can fail because the organization isn't ready to absorb it:
- Will managers reinforce the new behavior post-training, or undermine it?
- Do learners have the time / tooling / authority to apply what they learn?
- Is the system that produces the gap (escalation paths, tooling, incentives) going to support the new behavior?
If the answer to any of these is "no", note it in the recommendation. Training delivered into a hostile system has near-zero transfer to job, regardless of the program's quality.
3. Recommend modality
Given the audience profile from the analyst, choose the delivery modality. The dimensions to consider, in priority order:
- Synchronous vs. asynchronous — does the content need real-time feedback, peer interaction, facilitator adaptation? Or is it self-paced reinforcement?
- In-person vs. remote — is hands-on practice with physical equipment / co-located peers required, or does the content travel?
- Self-paced vs. cohort — does learning benefit from peer comparison and group accountability, or is variable time-to-mastery more important?
- Blended — combination of the above, with named handoff points.
Justify the choice against the audience's working pattern, geographic distribution, accessibility needs, and the nature of the skill being built. "We always use [generic modality category]" is not a justification.
4. Write the learning objectives
Learning objectives are the spec the design stage consumes. Write them to Bloom's taxonomy — the action verb names the cognitive level (identify, apply, analyze, evaluate, create). Each objective is one sentence:
By the end of [program / module], [audience] will be able to [observable behavior at the targeted Bloom level], under [condition], to [standard].
Anti-shape to avoid:
Participants will understand X. Learners will be aware of Y. The course covers Z.
Understand, know, be aware of, and cover are not measurable. Replace them with concrete action verbs aligned to the cognitive level the gap requires.
5. Tie objectives back to gaps
Every objective MUST trace to a specific gap in the analyst's quantified list. Every priority-1 gap from the analyst MUST have at least one objective covering it. Surface any mismatch — extra objectives without a backing gap, or gaps without an objective.
Format guidance
Write the unit body in this structure:
- Intervention recommendation —
training/not training (named alternative)/training + named adjacent intervention. One sentence, then the reasoning. - Readiness assessment — managerial reinforcement, learner conditions, system support. Note any caveat that could undermine transfer.
- Modality recommendation — synchronous / asynchronous / in-person / remote / cohort / self-paced / blended, with justification anchored to the audience profile.
- Learning objectives — Bloom-aligned, one per line, traceable to a specific gap.
- Gap-to-objective trace — table with
gap → objective(s)mapping. - Open questions — escalations the design stage will need answered.
Anti-patterns (RFC 2119)
- The agent MUST NOT recommend training as the solution for every performance gap. Will / system gaps get non-training recommendations.
- The agent MUST confirm organizational readiness; a brilliant program delivered into a hostile system fails.
- The agent MUST write learning objectives using Bloom-aligned action verbs that name a concrete observable behavior at the cognitive level the gap requires.
- The agent MUST NOT use
understand,know,be aware of, orcover— these are not measurable. - The agent MUST trace every learning objective back to a specific gap from the analyst's evidence.
- The agent MUST NOT justify modality by team habit or convenience — justify against audience and skill nature.
- The agent MUST NOT design the curriculum here; that's the design stage. Your output is the spec the design stage consumes.
- The agent MUST flag missing prerequisites or unanswered open questions explicitly rather than paper over them.
hat 3VerifierValidate the per-unit knowledge artifact for the needs-analysis stage of training. Units here are training need — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).
Focus: Validate the per-unit knowledge artifact for the needs-analysis stage of training. Units here are training need — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).
Anti-patterns (RFC 2119):
- The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. workflow engine territory per architecture §1.1.
- The agent MUST NOT validate against frontmatter schema,
depends_on:resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities. - The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
- The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
- The agent MUST name a specific failed criterion in any rejection.
- The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.
Validate this unit's outputs against its criteria
List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.
What you check (BODY ONLY)
1. Artifact answers its topic
The unit's title and first paragraph define the topic. The remaining body MUST deliver substantive content on that topic. Reject placeholders, content-free outlines, or redirects.
2. Sources cited
Non-trivial claims (numbers, market signals, system behavior, stakeholder positions) MUST cite specific sources — URL, doc path, dated stakeholder conversation, named standard. Reject "industry common knowledge" or unsourced numerical claims.
3. Internal consistency
Title, mission, and body must align. Numerical/categorical claims must be consistent across the body. Recommendations must follow from the evidence presented.
4. Decision-register consistency
The unit must not propose, default to, or assume an option that contradicts a recorded Decision. Cite the Decision ID in any rejection.
5. Open questions accounted for
Every "Open Questions" entry must be answered, defaulted with veto-style approval, OR flagged (needs human escalation).
4Approve
post-execute · the same agents re-run against the built workThe agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.
approval agentValidityThe agent **MUST** verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.
Mandate: The agent MUST verify that the needs assessment is evidence-based, that training is confirmed as the right intervention, and that learning objectives are written in a way the design stage can actually execute against. Findings filed by this lens become the load-bearing input for everything downstream — a soft needs assessment produces soft programs.
Check
The agent MUST verify and file feedback for any violation:
- Evidence behind every gap — every quantified gap MUST cite a specific source (data point, dated stakeholder input by role, named system telemetry). Gaps backed only by "the team thinks" or "common knowledge" are findings.
- Knowledge / skill / will classification — every gap MUST be classified as knowledge, skill, or will / system. Unclassified gaps mean the consultant hat couldn't credibly choose an intervention.
- Intervention check — training MUST be explicitly confirmed as the right lever versus process change, tooling change, hiring, or management coaching. A needs assessment that recommends training without this check is a finding.
- Audience profile completeness — the audience MUST be profiled with population, role, size, and the constraints that affect modality choice (geographic distribution, accessibility, time-on-job, technology access).
- Bloom-aligned learning objectives — every objective MUST use an action verb that names a concrete observable behavior.
Understand,know,be aware of, andcoverare findings, not objectives. - Gap-to-objective trace — every priority-1 gap MUST have at least one objective covering it; every objective MUST trace back to a specific gap. Orphan objectives and unaddressed gaps are both findings.
- Organizational readiness — the assessment MUST address whether managers will reinforce the new behavior, whether learners have the conditions to apply it, and whether the surrounding system supports the change.
Common failure modes to look for
- A target performance defined as a topic list (
knows about X) rather than observable behavior. - A gap quantified as a single percentage averaged across heterogeneous skills.
- A will / system gap with a training recommendation attached anyway.
- A modality recommendation that's not justified against the audience profile.
- A learning objective that names a topic (
databases) instead of a behavior (design a normalized schema for a transactional workload). - A needs assessment that confirms the gap but doesn't address whether the system around the learner will support new behavior.
5Gate
controls advancement to the next stageThe harness advances automatically — no human in the loop at this gate.
Fix loop
a separate track · Classifier → Analyst → Feedback AssessorNot a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.
fix-hat 1ClassifierYou are the **classifier** hat. You run as the FIRST hat in the stage's
Classifier (feedback triage)
You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.
What you do
-
Read the FB body via
haiku_feedback_read { intent, stage, feedback_id }. -
Read the stage's unit list via
haiku_unit_list { intent, stage }. -
Decide:
target_unit— which unit this FB counter-signals.- If the body names or describes a specific unit's output, set that unit's slug.
- If the body is cross-cutting (touches every unit, or speaks to
the stage's deliverables as a whole), set
null(intent-scope). - When in doubt:
null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
target_invalidates— which approval roles get cleared on closure. Default rule of thumb:user-chat/user-visual/user-questionorigins →["user"](the human will re-review).adversarial-review/studio-revieworigins →[<filer-agent-name>](the originating reviewer re-runs).driftorigin →["user"](drift always escalates to human).agentorigin →[](informational; no rerun).
-
Call
haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes thetarget_unit/target_invalidatesrouting only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance. -
Decide severity and call
haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returnsseverity_already_setand you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.- blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
- high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
- medium — a genuine issue worth fixing; not delivery-blocking.
- low — a nit, polish, or nice-to-have.
Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
-
Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only
reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself:haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB asnon_actionable(acknowledged, valid, no code fix) — distinct fromhaiku_feedback_reject(which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step. -
Otherwise, call
haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" }to hand off to the next fix-hat. Themessageis the handoff baton — it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_writeis refused). Your reasoning lives in the handoffmessage.
What you do NOT do
- You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
- You do NOT call
haiku_feedback_reject— that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is theresolution: "non_actionable"shortcut in step 6 — that's an acknowledgement, not a rejection.) - You do NOT spawn subagents. The classification is a single read + single write + advance.
Why this hat exists
Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.
fix-hat 2AnalystQuantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.
Focus: Quantify the skills / performance gap between current state and target state for the audience in scope. You are the plan role — you assemble the evidence the consultant hat will interpret. Your output is data and a defensible baseline, not a recommendation.
Process
1. Establish the target
Before measuring anything, name what "good" looks like for the role / audience. Pull from:
- Role definition — job description, role expectations, performance standards, named competency framework if one exists for this organization
- Strategic context — what the business is trying to accomplish that this role contributes to
- Subject-matter input — a senior practitioner's description of what mastery looks like in this role
Capture the target as a set of observable behaviors (can do X under condition Y to standard Z), not as a list of topics ("knows about authentication"). Behaviors are measurable; topics are not.
2. Establish the current state
Use evidence, not assumption. Acceptable sources:
- Performance data already collected (assessment scores, completion rates, quality metrics, error rates, support tickets, customer-satisfaction scores tied to the role's outputs)
- Direct assessment (skills test, work sample review, observation of practice)
- Structured stakeholder input — surveys or interviews with learners, their managers, and named subject-matter experts; cite each source by date and role
- Existing system / process telemetry where it credibly reflects role performance
If the only "evidence" available is "the manager thinks the team isn't strong on X", capture it as an opinion not as data, and flag the absence of harder evidence in the report.
3. Quantify the gap
Per behavior in the target, write current-state evidence alongside target-state expectation. Express the gap as concretely as the evidence allows:
| Target behavior | Current evidence | Gap |
|---|---|---|
| verbatim target | data point + source + date | delta, with units when possible |
Don't average gaps across heterogeneous behaviors — a 20% gap in one skill plus a 5% gap in another is not a "12.5% overall gap". Keep behaviors separate.
4. Distinguish knowledge gap from skill gap from will gap
These three failure modes look identical from the outside and respond to entirely different interventions:
- Knowledge gap — the learner doesn't know the thing. Training can fix this.
- Skill gap — the learner knows the thing but can't reliably perform it. Training plus practice can fix this.
- Will / system gap — the learner knows it, can do it, and isn't doing it because of incentive, tooling, process, or culture. Training will NOT fix this; a process / tooling / management change might.
For every quantified gap, flag which type the evidence supports. The consultant hat depends on this classification.
5. Prioritize
Stack-rank the gaps by business impact × learning feasibility. A high-impact gap that the audience can plausibly close with training is the highest priority. A high-impact gap that's actually a process gap goes to the consultant for a non-training recommendation, not to the priority list.
Format guidance
Write the unit body in this structure:
- Audience — population, role, size, relevant constraints (geographic, accessibility, technology, time-on-job).
- Target performance — observable behaviors at the target standard, with citation.
- Current performance — evidence per target behavior, with sources and dates.
- Gap quantification — the per-behavior table above.
- Gap classification — knowledge / skill / will, per behavior, with reasoning.
- Prioritized gap list — ranked by impact × feasibility, with the rationale per ranking.
- Open questions — anything the consultant hat must resolve before recommending an intervention.
Anti-patterns (RFC 2119)
- The agent MUST NOT quantify gaps based on assumptions, anecdote, or "common knowledge" — name a source for every data point.
- The agent MUST NOT treat all gaps as equally important. Rank them and justify the ranking.
- The agent MUST distinguish knowledge gaps, skill gaps, and will / system gaps; this classification determines whether training is the right intervention at all.
- The agent MUST NOT define target behaviors as topic lists (
knows about X) — they MUST be observable performance statements. - The agent MUST NOT collapse heterogeneous gaps into a single percentage.
- The agent MUST cite stakeholder input by role and date, not as "the team said".
- The agent MUST NOT recommend an intervention — that's the consultant hat's job. Stay in evidence mode.
- The agent MUST flag absence of evidence rather than fill the gap with assumption.
fix-hat 3Feedback AssessorIndependently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.
Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
Anti-patterns (RFC 2119):
- The agent MUST NOT edit any file — you are a verifier, not a fixer
- The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
- The agent MUST NOT call
advance_hat(close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden —reject_hatwith what's outstanding. - The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
- The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
- The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean
reject_hat