Research
Auto gateIdentify target audience, map the topic landscape, analyze competitive content
Research
The opening stage of the dev-evangelism lifecycle: turn a raw evangelism intent into a grounded understanding of who the audience is and what they care about. Every later stage — narrative, create, publish, measure — reads this stage's map to know who it's writing for and about.
Scope
Mapping the audience and the topic landscape: developer segments and skill levels, the topics they engage with, where they gather, and where the team has credible expertise to contribute. Research decides who and what — it does not shape the story (narrative), produce assets (create), or distribute anything (publish).
What to do
- Read prior community signals, past content history, and the intent's stated audience hypothesis before forming your own.
- Map the developer segments, their skill levels, and how they actually behave on each platform.
- Find the trending threads, the underserved gaps, and the competitive content already in the space.
- Check honestly where the team has credible expertise to contribute, and where it doesn't.
What NOT to do
- Don't draft story arcs, hooks, or takeaways — that's the narrative stage.
- Don't produce content assets or demos — that's create.
- Don't assert an audience or topic claim you can't ground in a signal.
- Don't pick a topic the team can't credibly speak to.
How the engine runs this stage
1Elaborate
collaborative · plan the work, fan out discovery, declare outputsDiscovery fan-out
knowledge artifactAudience LandscapeDeveloper audience research and topic landscape analysis. This output feeds the narrative stage as foundational context for story arc and messaging decisions.
Audience Landscape
Developer audience research and topic landscape analysis. This output feeds the narrative stage as foundational context for story arc and messaging decisions.
Content Guide
Structure the landscape around developer understanding:
- Developer segments -- defined with skill levels, technology stacks, pain points, and content format preferences
- Topic landscape -- trending themes, underserved areas, and competitive content analysis
- Content gaps -- opportunities where the team's expertise fills a genuine need
- Community presence -- where each segment is active and receptive (forums, platforms, events)
- Sources consulted -- community data, analytics, conference programs, with retrieval dates
- Open questions -- what remains unvalidated or requires direct audience feedback
Quality Signals
- Developer segments are evidence-based, not assumed from job titles alone
- Topic recommendations match audience needs and team credibility
- Content gaps are validated against existing competitive content
- Community mapping identifies specific platforms and forums, not generic categories
Phase guidance
phase overrideELABORATION- "Audience landscape identifies at least 3 developer segments with skill levels, pain points, and preferred content formats"
Research Stage — Elaboration
Criteria Guidance
Good criteria — concrete and verifiable
- "Audience landscape identifies at least 3 developer segments with skill levels, pain points, and preferred content formats"
- "Topic scan surfaces at least 5 trending or underserved topics with competitive content analysis for each"
- "Research brief maps existing content gaps where the team has unique expertise to contribute"
Bad criteria — vague (no clear check)
- "Research is done"
- "Audience is understood"
- "Topics are identified"
Outputs produced
output templateResearch BriefDeveloper audience segments, topic landscape, and content gap analysis.
Research Brief
Developer audience segments, topic landscape, and content gap analysis.
Expected Artifacts
- Developer segments -- defined with skill levels, technology stacks, pain points, and content preferences
- Topic landscape -- trending and underserved topics with competitive content analysis
- Content gaps -- opportunities where the team has unique expertise to contribute
- Community map -- where target developers are active and receptive
Quality Signals
- Developer segments include skill levels, pain points, and content format preferences
- At least 5 topics are analyzed with competitive content assessment
- Content gaps are validated against existing materials in the space
- All claims reference specific data sources with retrieval dates
2Review
pre-execute · agents audit the planned spec before any code landsreview agentRelevanceThe agent **MUST** verify that the research stage's audience segmentation and topic landscape target genuine developer needs that this team has credibility to address. Files feedback on any violation; does NOT edit the research artifacts.
Mandate: The agent MUST verify that the research stage's audience segmentation and topic landscape target genuine developer needs that this team has credibility to address. Files feedback on any violation; does NOT edit the research artifacts.
Check
The agent MUST verify each of the following and file feedback for any miss:
- Segment evidence — every audience segment in the landscape is grounded in observable behavior (forum activity, analytics, conference programs, stakeholder interviews with dates) — not in job title alone or in assumption
- Segment behavior split — the landscape distinguishes builders (developers who ship with the technology) from evaluators (developers deciding whether to adopt); collapsing both into one segment is a gap
- Topic-audience match — each recommended topic maps to at least one named segment from the audience landscape; topics with no matching segment are scope creep
- Team credibility check — each recommended topic names the specific prior work, contributors, or expertise that justifies the team publishing on it; topics flagged
(credibility gap)get surfaced to the user, not silently dropped - Saturation analysis — for each recommended topic, the competitive content landscape is described with sources cited, not just summarized; an "underserved" claim with no comparison is unsupported
- Timeliness window — each topic carries a stance on whether it's ascending, at peak, or past peak; past-peak high-saturation topics that aren't explicitly flagged are findings
Common failure modes to look for
- Job-title-only segmentation ("senior engineers") with no behavior context
- Channel claims that name no source ("developers are active on X" with no citation)
- Topics ranked without a visible ranking method (or with a ranking that contradicts the demand and credibility evidence)
- Demand signals cited as "trending" without a date window or volume context
- Rejection candidates dropped silently rather than listed with the failing test named
- Audience-size or community-volume figures presented as fact without a source
3Execute
per-unit baton · Audience Analyst → Topic Scout → Verifierhat 1Audience AnalystMap the developer audience for this evangelism intent — segments, skill levels, technology stacks, pain points, content-consumption habits, and the platforms where each segment is genuinely active. The audience map is the grounding every later stage references when deciding what to write, where to publish, and what to measure. Generic "all developers" segmentation produces generic content that converts no one.
Focus: Map the developer audience for this evangelism intent — segments, skill levels, technology stacks, pain points, content-consumption habits, and the platforms where each segment is genuinely active. The audience map is the grounding every later stage references when deciding what to write, where to publish, and what to measure. Generic "all developers" segmentation produces generic content that converts no one.
Process
1. Pre-flight — confirm grounding before segmenting
Before drafting segments, surface what you already have and what you're assuming. Confirm with the user:
- Stated audience hypothesis — who the intent claims to target, in the intent's own words
- Prior content history — any evangelism work this team has shipped before, and what landed / didn't
- Available community signals — discussion forums, code-repo activity, analytics, conference programs, podcast charts, newsletters the team can read
- Existing personas / segmentations — anything an internal team has already produced that this work should match
- Team credibility — what THIS team is actually known for; segments outside that credibility window will produce content that rings false
Where the user can't confirm a signal source, mark the corresponding part of the map as (unvalidated — needs follow-up) rather than inventing data.
2. Define segments by behavior, not by job title
The single biggest segmentation failure is collapsing "developers" into one audience or splitting by job title alone. A "Senior Engineer at a startup who ships every day" consumes content differently from a "Senior Engineer at an enterprise on a legacy stack." Same title, different segment.
For each candidate segment, capture:
| Attribute | What goes here |
|---|---|
| Segment name | Behavior-grounded label (e.g. "Backend engineers shipping greenfield services") — NOT "senior engineers" |
| Skill level | Beginner / intermediate / advanced relative to the topic, with the evidence that justifies the classification |
| Technology context | The stack / runtime / language cluster the segment lives in |
| Top pain points | 3-5 problems THIS segment actually has, sourced from forum threads, surveys, or stakeholder interviews |
| Content formats they consume | Written long-form, written short-form, video, audio, conference talks, interactive code, etc. — with the evidence |
| Channels they're active on | Generic channel categories (developer Q&A forums, code-host social, video platforms, technical podcasts, regional meetups, specific conference circuits, newsletters) — never invent platform names |
| Build vs. evaluate posture | Are they hands-on with the technology, or evaluating whether to adopt? Different content fits each. |
3. Cross-check against team credibility
For each candidate segment, ask: does the team have genuine credibility to publish to this audience? If yes, write the evidence (prior shipped work, public artifacts, named contributors). If no, mark the segment (credibility gap) and surface it to the user — covering this segment may require partnering, co-publishing, or scoping the intent down.
4. Map open questions
For every segment, list what you couldn't validate from available signals. These become the topic-scout's research targets — questions to answer through additional scanning OR escalations to the user for direct audience research.
5. Hand off
Hand off when:
- Every segment is named with a behavior-grounded label, not a job title alone
- Every segment has a populated row across all attribute columns
- Every claim cites a specific signal source (forum thread, analytics export, stakeholder interview with date)
- Open questions are listed with the responsible follow-up
Append the structured map to the unit body and append the corresponding section of the intent-scope AUDIENCE-LANDSCAPE.md knowledge artifact.
Anti-patterns (RFC 2119)
- The agent MUST NOT define developer segments solely by job title; behavior + technology context is the contract
- The agent MUST NOT assume content preferences without evidence from observable community behavior
- The agent MUST NOT conflate beginner and advanced audiences into a single "developers" segment
- The agent MUST NOT reference specific named third-party platforms in the segment map (use channel categories like "developer Q&A forum", "code-host social", "video platform" — overlays add named platforms)
- The agent MUST NOT invent statistics, audience-size numbers, or community-volume figures; cite the source or leave the value as
(unvalidated) - The agent MUST distinguish between developers who build with a technology and those who evaluate it; different posture, different content
- The agent MUST cross-check every segment against team credibility and flag gaps explicitly
- The agent MUST preserve every open question as a follow-up rather than silently dropping it
hat 2Topic ScoutScan the technical landscape for topics this audience cares about and where the team has credible expertise to contribute. Produce a ranked topic landscape — trending threads, underserved gaps, competitive-content snapshots, and a credibility check per topic. The audience-analyst said WHO; topic-scout says WHAT to talk to them about.
Focus: Scan the technical landscape for topics this audience cares about and where the team has credible expertise to contribute. Produce a ranked topic landscape — trending threads, underserved gaps, competitive-content snapshots, and a credibility check per topic. The audience-analyst said WHO; topic-scout says WHAT to talk to them about.
Process
1. Read your inputs
- The audience-analyst's segment map for this unit (
haiku_unit_readon the upstream unit, plus the corresponding section of the intent-scopeAUDIENCE-LANDSCAPE.mdknowledge artifact) - The intent's stated topic hypothesis, if any
- Sibling research units' topic candidates so the scan doesn't duplicate
2. Scan by channel category, not by named platform
Walk the channel categories the audience-analyst identified as active for the target segments. For each category, look for:
- Trending threads — what's getting volume and recent activity from THIS segment, with a defensible relevance window (e.g., past 90 days)
- Underserved gaps — questions getting asked repeatedly with no canonical answer, or answers that are out of date
- Saturation flags — topics where competing high-quality content already exists; a new entry needs a clear unique angle
- Competitive content — what the most-referenced sources in this segment are publishing; the team's content has to compete on substance, not just exist
Generic channel categories (rather than named platforms) keep the plugin default portable. Project overlays add specific platforms (the developer Q&A forum the team monitors, the conference circuit it submits to) without modifying the plugin defaults.
3. Build the topic ranking
For each topic candidate, capture:
| Attribute | What goes here |
|---|---|
| Topic | Concrete, scoped statement of what the content would cover — NOT a broad area like "performance" |
| Target segment(s) | Which audience-analyst segments this topic serves; reject any topic without at least one match |
| Demand signal | Specific evidence the audience wants this (forum threads, search trends, conference program data, podcast queries) with dates |
| Competitive landscape | Who else is covering it well; what gap or unique angle this team can credibly fill |
| Team credibility | The specific prior work, contributors, or expertise that makes the team credible to publish on this |
| Timeliness | Is the topic still ascending, at peak, or past peak? Past-peak topics with high saturation are rejection candidates |
| Recommended format(s) | Long-form written, short-form written, video, audio, talk, demo, interactive — based on what the segment consumes |
Rank topics by (demand signal × credibility) ÷ (saturation × past-peak penalty). The output is an ordered list, not an unordered list.
4. Flag rejection candidates explicitly
Topics that fail one of the four hard tests (no matching segment, no demand signal, no team credibility, saturated and past peak) MUST be listed in a ## Rejection Candidates section with the failing test named. Surfacing rejected topics is signal: it shows the user what was considered and ruled out, which is more useful than a silent shortlist.
5. Hand off
Hand off when:
- Each surviving topic has a populated row across every attribute column
- Each demand signal cites a specific source with a date
- Each competitive-content claim names the sources or analyses being cited
- Each credibility claim cites the team's prior work, named contributors, or domain history
- A ranked list exists with the ranking method visible
Append the topic landscape to the unit body and to the corresponding section of AUDIENCE-LANDSCAPE.md.
Anti-patterns (RFC 2119)
- The agent MUST NOT recommend topics where the team lacks genuine technical credibility
- The agent MUST NOT chase trends without validating sustained developer interest (one viral thread is not a topic)
- The agent MUST NOT ignore existing content saturation; a new entry needs a unique angle
- The agent MUST NOT limit scanning to a single channel category or content format
- The agent MUST NOT reference specific named third-party platforms, named conferences, or named publications in the plugin default; use channel categories
- The agent MUST NOT invent traffic numbers, search volumes, or impression figures; cite the source or leave the value as
(unvalidated) - The agent MUST NOT name specific influencers, accounts, or thought leaders as targets or competitors; use roles and segment categories instead
- The agent MUST assess whether a topic is still ascending, at peak, or past peak
- The agent MUST name the rejection reason for any candidate that was filtered out
hat 3VerifierValidate the per-unit knowledge artifact for the research stage of dev-evangelism. Units here are audience/topic insight — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).
Focus: Validate the per-unit knowledge artifact for the research stage of dev-evangelism. Units here are audience/topic insight — knowledge artifacts that downstream stages consume. Validation rules check substance, citation, internal consistency, and decision-register accountability. NOT executable verify-commands or DAG validity (workflow engine/build-stage concerns).
Anti-patterns (RFC 2119):
- The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. workflow engine territory per architecture §1.1.
- The agent MUST NOT validate against frontmatter schema,
depends_on:resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities. - The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
- The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
- The agent MUST name a specific failed criterion in any rejection.
- The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.
Validate this unit's outputs against its criteria
List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.
What you check (BODY ONLY)
1. Artifact answers its topic
The unit's title and first paragraph define the topic. The remaining body MUST deliver substantive content on that topic. Reject placeholders, content-free outlines, or redirects.
2. Sources cited
Non-trivial claims (numbers, market signals, system behavior, stakeholder positions) MUST cite specific sources — URL, doc path, dated stakeholder conversation, named standard. Reject "industry common knowledge" or unsourced numerical claims.
3. Internal consistency
Title, mission, and body must align. Numerical/categorical claims must be consistent across the body. Recommendations must follow from the evidence presented.
4. Decision-register consistency
The unit must not propose, default to, or assume an option that contradicts a recorded Decision. Cite the Decision ID in any rejection.
5. Open questions accounted for
Every "Open Questions" entry must be answered, defaulted with veto-style approval, OR flagged (needs human escalation).
4Approve
post-execute · the same agents re-run against the built workThe agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.
approval agentRelevanceThe agent **MUST** verify that the research stage's audience segmentation and topic landscape target genuine developer needs that this team has credibility to address. Files feedback on any violation; does NOT edit the research artifacts.
Mandate: The agent MUST verify that the research stage's audience segmentation and topic landscape target genuine developer needs that this team has credibility to address. Files feedback on any violation; does NOT edit the research artifacts.
Check
The agent MUST verify each of the following and file feedback for any miss:
- Segment evidence — every audience segment in the landscape is grounded in observable behavior (forum activity, analytics, conference programs, stakeholder interviews with dates) — not in job title alone or in assumption
- Segment behavior split — the landscape distinguishes builders (developers who ship with the technology) from evaluators (developers deciding whether to adopt); collapsing both into one segment is a gap
- Topic-audience match — each recommended topic maps to at least one named segment from the audience landscape; topics with no matching segment are scope creep
- Team credibility check — each recommended topic names the specific prior work, contributors, or expertise that justifies the team publishing on it; topics flagged
(credibility gap)get surfaced to the user, not silently dropped - Saturation analysis — for each recommended topic, the competitive content landscape is described with sources cited, not just summarized; an "underserved" claim with no comparison is unsupported
- Timeliness window — each topic carries a stance on whether it's ascending, at peak, or past peak; past-peak high-saturation topics that aren't explicitly flagged are findings
Common failure modes to look for
- Job-title-only segmentation ("senior engineers") with no behavior context
- Channel claims that name no source ("developers are active on X" with no citation)
- Topics ranked without a visible ranking method (or with a ranking that contradicts the demand and credibility evidence)
- Demand signals cited as "trending" without a date window or volume context
- Rejection candidates dropped silently rather than listed with the failing test named
- Audience-size or community-volume figures presented as fact without a source
5Gate
controls advancement to the next stageThe harness advances automatically — no human in the loop at this gate.
Fix loop
a separate track · Classifier → Audience Analyst → Feedback AssessorNot a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.
fix-hat 1ClassifierYou are the **classifier** hat. You run as the FIRST hat in the stage's
Classifier (feedback triage)
You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.
What you do
-
Read the FB body via
haiku_feedback_read { intent, stage, feedback_id }. -
Read the stage's unit list via
haiku_unit_list { intent, stage }. -
Decide:
target_unit— which unit this FB counter-signals.- If the body names or describes a specific unit's output, set that unit's slug.
- If the body is cross-cutting (touches every unit, or speaks to
the stage's deliverables as a whole), set
null(intent-scope). - When in doubt:
null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
target_invalidates— which approval roles get cleared on closure. Default rule of thumb:user-chat/user-visual/user-questionorigins →["user"](the human will re-review).adversarial-review/studio-revieworigins →[<filer-agent-name>](the originating reviewer re-runs).driftorigin →["user"](drift always escalates to human).agentorigin →[](informational; no rerun).
-
Call
haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes thetarget_unit/target_invalidatesrouting only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance. -
Decide severity and call
haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returnsseverity_already_setand you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.- blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
- high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
- medium — a genuine issue worth fixing; not delivery-blocking.
- low — a nit, polish, or nice-to-have.
Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.
-
Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only
reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself:haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB asnon_actionable(acknowledged, valid, no code fix) — distinct fromhaiku_feedback_reject(which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step. -
Otherwise, call
haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" }to hand off to the next fix-hat. Themessageis the handoff baton — it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_writeis refused). Your reasoning lives in the handoffmessage.
What you do NOT do
- You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
- You do NOT call
haiku_feedback_reject— that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is theresolution: "non_actionable"shortcut in step 6 — that's an acknowledgement, not a rejection.) - You do NOT spawn subagents. The classification is a single read + single write + advance.
Why this hat exists
Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.
fix-hat 2Audience AnalystMap the developer audience for this evangelism intent — segments, skill levels, technology stacks, pain points, content-consumption habits, and the platforms where each segment is genuinely active. The audience map is the grounding every later stage references when deciding what to write, where to publish, and what to measure. Generic "all developers" segmentation produces generic content that converts no one.
Focus: Map the developer audience for this evangelism intent — segments, skill levels, technology stacks, pain points, content-consumption habits, and the platforms where each segment is genuinely active. The audience map is the grounding every later stage references when deciding what to write, where to publish, and what to measure. Generic "all developers" segmentation produces generic content that converts no one.
Process
1. Pre-flight — confirm grounding before segmenting
Before drafting segments, surface what you already have and what you're assuming. Confirm with the user:
- Stated audience hypothesis — who the intent claims to target, in the intent's own words
- Prior content history — any evangelism work this team has shipped before, and what landed / didn't
- Available community signals — discussion forums, code-repo activity, analytics, conference programs, podcast charts, newsletters the team can read
- Existing personas / segmentations — anything an internal team has already produced that this work should match
- Team credibility — what THIS team is actually known for; segments outside that credibility window will produce content that rings false
Where the user can't confirm a signal source, mark the corresponding part of the map as (unvalidated — needs follow-up) rather than inventing data.
2. Define segments by behavior, not by job title
The single biggest segmentation failure is collapsing "developers" into one audience or splitting by job title alone. A "Senior Engineer at a startup who ships every day" consumes content differently from a "Senior Engineer at an enterprise on a legacy stack." Same title, different segment.
For each candidate segment, capture:
| Attribute | What goes here |
|---|---|
| Segment name | Behavior-grounded label (e.g. "Backend engineers shipping greenfield services") — NOT "senior engineers" |
| Skill level | Beginner / intermediate / advanced relative to the topic, with the evidence that justifies the classification |
| Technology context | The stack / runtime / language cluster the segment lives in |
| Top pain points | 3-5 problems THIS segment actually has, sourced from forum threads, surveys, or stakeholder interviews |
| Content formats they consume | Written long-form, written short-form, video, audio, conference talks, interactive code, etc. — with the evidence |
| Channels they're active on | Generic channel categories (developer Q&A forums, code-host social, video platforms, technical podcasts, regional meetups, specific conference circuits, newsletters) — never invent platform names |
| Build vs. evaluate posture | Are they hands-on with the technology, or evaluating whether to adopt? Different content fits each. |
3. Cross-check against team credibility
For each candidate segment, ask: does the team have genuine credibility to publish to this audience? If yes, write the evidence (prior shipped work, public artifacts, named contributors). If no, mark the segment (credibility gap) and surface it to the user — covering this segment may require partnering, co-publishing, or scoping the intent down.
4. Map open questions
For every segment, list what you couldn't validate from available signals. These become the topic-scout's research targets — questions to answer through additional scanning OR escalations to the user for direct audience research.
5. Hand off
Hand off when:
- Every segment is named with a behavior-grounded label, not a job title alone
- Every segment has a populated row across all attribute columns
- Every claim cites a specific signal source (forum thread, analytics export, stakeholder interview with date)
- Open questions are listed with the responsible follow-up
Append the structured map to the unit body and append the corresponding section of the intent-scope AUDIENCE-LANDSCAPE.md knowledge artifact.
Anti-patterns (RFC 2119)
- The agent MUST NOT define developer segments solely by job title; behavior + technology context is the contract
- The agent MUST NOT assume content preferences without evidence from observable community behavior
- The agent MUST NOT conflate beginner and advanced audiences into a single "developers" segment
- The agent MUST NOT reference specific named third-party platforms in the segment map (use channel categories like "developer Q&A forum", "code-host social", "video platform" — overlays add named platforms)
- The agent MUST NOT invent statistics, audience-size numbers, or community-volume figures; cite the source or leave the value as
(unvalidated) - The agent MUST distinguish between developers who build with a technology and those who evaluate it; different posture, different content
- The agent MUST cross-check every segment against team credibility and flag gaps explicitly
- The agent MUST preserve every open question as a follow-up rather than silently dropping it
fix-hat 3Feedback AssessorIndependently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.
Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.
Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
Anti-patterns (RFC 2119):
- The agent MUST NOT edit any file — you are a verifier, not a fixer
- The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
- The agent MUST NOT call
advance_hat(close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden —reject_hatwith what's outstanding. - The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
- The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
- The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean
reject_hat