Library Development · stage 4 of 4

Release

Auto gate

Publish, changelog, documentation, and deprecation policy

Release

Publish the built library to its registry, generate the changelog, update the documentation, and manage the deprecation lifecycle. Libraries don't deploy — they publish, and publishing is one-shot: once a version resolves in the registry it can't be unpublished without breaking every consumer who already pulled it. A broken release is a new patch version, not a rollback.

Scope

Turning built code into a published version — semver decision, changelog, registry publish, git tag, docs update, deprecation policy, and post-publish smoke install. Release decides how the work reaches consumers and how the version is communicated — not what was built (development) or whether it's safe (security), though it surfaces the security guidance and the semver impact those stages produced.

What to do

  • Decide the semver impact by diffing the new public surface against the prior one, and write the changelog entry that explains it.
  • Update the documentation to match the release — API reference, migration guides for breaking changes, and the security guidance integrated into the relevant sections.
  • Prepare an unambiguous publish action with a mechanically decidable post-condition (the smoke install confirming the version resolves and imports).
  • Honor the deprecation policy when retiring surface, so consumers get warning before removal.

What NOT to do

  • Don't publish a breaking change under a patch or minor bump — the semver math is a hard constraint, not a preference.
  • Don't redefine or reimplement the library here; release publishes what development built.
  • Don't ship a release whose smoke install hasn't been confirmed.
  • Don't remove public surface without the deprecation path the policy requires.

How the engine runs this stage

1Elaborate

autonomous · plan the work, fan out discovery, declare outputs

Phase guidance

phase overrideELABORATIONRelease is an **operational** stage. Its units are operational steps in the publish-to-registry pipeline — discrete actions with preconditions, an action procedure, and post-condition checks. Libraries publish; they don't deploy. There is no on-call rotation; a broken release is corrected with a new patch version.

Release Stage — Elaboration

Release is an operational stage. Its units are operational steps in the publish-to-registry pipeline — discrete actions with preconditions, an action procedure, and post-condition checks. Libraries publish; they don't deploy. There is no on-call rotation; a broken release is corrected with a new patch version.

What a unit IS in this stage

One operational step in the publish pipeline. Examples:

  • "Version bump and changelog generation — semver decision, CHANGELOG.md entry"
  • "Registry publish — npm/PyPI/crates.io/Maven Central publish action with credentials"
  • "Tag and release notes — git tag, GitHub/GitLab release with assets attached"
  • "Documentation site deploy — docs build, site preview, prod swap"
  • "Deprecation notice — for any removed API, ship the deprecation in the prior minor first"
  • "Post-publish smoke install — new project pulls the published version and runs a hello-world script"

What a unit is NOT in this stage:

  • ❌ A new feature (that belongs in development)
  • ❌ A security audit (that belongs in security)
  • ❌ An API surface decision (that belongs back in inception)

What "completion criteria" means here

Operational-step criteria specify preconditions, action, post-condition check, and rollback (or explicit "no rollback — patch forward" rationale).

Good criteria — concrete and verifiable

  • "Registry publish post-condition: npm view <pkg>@<version> returns the new version within 5 minutes of npm publish"
  • "Smoke-install post-condition: a new throwaway project can npm install <pkg>@<version> AND import + call the documented hello-world API without errors"
  • "Changelog post-condition: CHANGELOG.md has a ## [<version>] - <YYYY-MM-DD> heading with at least one entry under Added/Changed/Fixed; CI lints this"

Bad criteria — vague or wrong-stage

  • ❌ "Library is published" (no check against the registry)
  • ❌ "Version is correct" (correct by what rule?)
  • ❌ "API works" — wrong stage; that's a development concern

How verification happens

Release artifacts are validated by the verifier hat (hats/verifier.md). The verifier checks preconditions stated, action unambiguous, post-condition mechanically decidable, deprecation policy honored where applicable — body-content checks only, no frontmatter interpretation.

Anti-patterns

  • Skipping the smoke install. "Published successfully" with no post-publish install test is how you find out later that the package was published with the wrong files / missing entry point / bad shape.
  • Skipping the deprecation step. Removing a public API in a major without a deprecation in the prior minor is a contract break. The verifier should reject any release unit that removes API without a recorded deprecation in the prior version.
  • Treating registry publish as atomic. Publish + tag + docs + smoke-install are separate operational steps; lumping them as "ship it" loses the audit trail when one of them silently fails.

Outputs produced

output templateRelease ArtifactsThe published outputs of a library release: the version tag, the registry publication, the changelog entry, and the updated documentation. This is not a document in the intent directory; it is the state of the outside world after release.

Release Artifacts

The published outputs of a library release: the version tag, the registry publication, the changelog entry, and the updated documentation. This is not a document in the intent directory; it is the state of the outside world after release.

Content Guide

A release is complete when:

  • Registry publication is live and installable by consumers (npm install foo@X.Y.Z, pip install foo==X.Y.Z, etc.)
  • Git tag exists matching the published version and points at the commit that was built
  • Changelog in the repo reflects the release with all user-visible changes
  • Documentation site reflects the released version (if the project has one)
  • Migration guide exists for any breaking change

Quality Signals

  • The published artifact matches the commit at the git tag
  • The version in the registry matches the version in the changelog
  • Consumers can install and run a hello-world example of the new version without further steps
  • The release announcement (if any) links to changelog and migration guide

2Review

pre-execute · agents audit the planned spec before any code lands
review agentChangelog QualityThe agent **MUST** verify the changelog entry for this release is complete, accurate, and useful to consumers deciding whether to upgrade. The changelog is a contract — when consumers grep for "Breaking" before upgrading, they're trusting it to be exhaustive.

Mandate: The agent MUST verify the changelog entry for this release is complete, accurate, and useful to consumers deciding whether to upgrade. The changelog is a contract — when consumers grep for "Breaking" before upgrading, they're trusting it to be exhaustive.

Check

The agent MUST verify, file feedback for any violation:

  • Every public-surface change has a changelog line — Diff this release's api-surface against the prior released version. Every added, removed, renamed, or signature-changed export appears as a changelog entry. Silent surface changes are the highest-priority finding.
  • Breaking changes are marked clearly — Breaking changes appear under a dedicated heading (or are tagged inline per project convention) so consumers can find them with a single scroll or grep. Breaking changes scattered across other sections fail this check.
  • Security-relevant changes labeled — Any fix that addresses a security finding from the security stage appears under a Security heading or with an explicit Security label, with a one-line description of the impact (no exploit details) and a link to the advisory or consumer guidance.
  • Consumer-language descriptions — Entries describe what changes for the consumer, not what changed internally. "Renamed parseFoo to parse" beats "refactored parser entry point"; "fixed off-by-one in pagination cursor" beats "minor parser fix."
  • Format matches project convention — Existing changelog format (Keep a Changelog, custom, etc.) is preserved. New sections, new tag conventions, new ordering all fail this check unless overlay says otherwise.
  • Deprecation lifecycle visible — APIs deprecated in this release appear under Deprecated; APIs removed in this release that were deprecated in the prior minor cite the prior deprecation entry.
  • Version + date present — The release heading includes both the version number and the date in the project's documented format. Skipping the date breaks chronological scanning.

Common failure modes to look for

  • A new export shipped without a changelog line — visible only by diffing the manifest
  • A "minor refactor" entry that hides a behavior change in an existing API
  • Breaking changes listed in the Changed or Fixed section instead of Breaking
  • A security fix described only as "fixed bug" with no security label
  • Internal language ("refactored the dispatcher loop") instead of consumer language ("retry behavior now respects the retry-after header")
  • An API removed in this release with no corresponding deprecation entry in the prior minor's changelog
  • Multiple changes lumped into a single bullet point so consumers can't grep for any one of them
  • A version heading missing the release date, or using a non-standard date format
review agentSemver CorrectnessThe agent **MUST** verify the release version number correctly reflects the semver impact of changes since the prior release. A wrong bump is one of two contract breaks: a missed major leaves consumers with silently broken pinning; a gratuitous major forces churn for nothing.

Mandate: The agent MUST verify the release version number correctly reflects the semver impact of changes since the prior release. A wrong bump is one of two contract breaks: a missed major leaves consumers with silently broken pinning; a gratuitous major forces churn for nothing.

Check

The agent MUST verify, file feedback for any violation:

  • Surface diff justifies the chosen bump — Diff the implemented api-surface against the prior released version. Confirm that the highest-impact change category matches or is below the chosen bump (major beats minor beats patch).
  • Major bump for any removed / renamed / signature-changed public symbol — Including signature changes that look minor: narrowed parameter type, widened return type, parameter renamed, default value changed in a consumer-observable way.
  • Major bump for closed error-set changes — Adding an error variant to a typed union the api-surface declared closed (exhaustive) is a major change. Removing or renaming an error variant is always major.
  • Major bump for behavior changes on existing entry points — Same signature, different observable behavior (stricter validation, changed defaults, different ordering, different idempotency semantics) is a major change. The signature is not the whole contract.
  • Minor bump for additions-only changes — New export, new optional parameter that doesn't shift positional callers, new error variant in a non-exhaustive (open) error set. None of these break existing code.
  • Patch bump only when no public surface changed — Internal fix, dependency-only bump, documentation-only release. Any public-surface delta disqualifies a patch.
  • Pre-1.0 versions follow the same rules — Pre-1.0 is not "any bump is fine." Pre-1.0 just means the major is zero; minor bumps should still be additive and patches should still preserve the surface. The project's documented pre-1.0 policy (if any) overrides this default.
  • Prior deprecations honored — An API removed in this release must have been deprecated in the prior minor release. A major bump that removes a non-deprecated API is flagged even if the bump itself is correct.

Common failure modes to look for

  • A "minor refactor" patch release that quietly changes the default value of a public option
  • A minor bump that adds a new required parameter to an existing function ("but no one uses that path")
  • A patch bump that "just adds a new export" — additions are minor, not patch
  • A major bump that removes an API never marked deprecated
  • A behavior change shipping as a patch because the signature didn't change
  • An error type removed from the union with no major bump because the union was "obviously open"
  • A pre-1.0 release where every bump is a major because "we can change anything"
  • A bump derived from internal change classification (story points, perceived complexity) rather than surface-diff classification

3Execute

per-unit baton · Release Engineer → Doc Writer → Verifier
hat 1Doc WriterUpdate the library's public documentation to reflect the release. API reference, migration guides for breaking changes, surfaced security guidance, and consumer-facing release notes are part of the contract — undocumented changes are bugs. Documentation lands BEFORE the release is announced, never after; consumers searching the docs after upgrading need accurate guidance immediately.

Focus: Update the library's public documentation to reflect the release. API reference, migration guides for breaking changes, surfaced security guidance, and consumer-facing release notes are part of the contract — undocumented changes are bugs. Documentation lands BEFORE the release is announced, never after; consumers searching the docs after upgrading need accurate guidance immediately.

Process

1. Inventory what changed

Walk the changelog entry the release-engineer wrote. For each line:

  • Identify the documentation surfaces affected — API reference page(s), guide pages, examples, README, FAQ
  • Note whether the change requires a new page (new feature with conceptual guide) or an edit to an existing page
  • Flag breaking changes that need migration guides

Documentation drift comes from changes shipping faster than docs. A complete inventory up front prevents that.

2. Update the API reference

For each public symbol added, changed, or removed:

  • Added: full signature, parameter explanations, return-value description, error variants, idiomatic example, version-added annotation
  • Changed: update signature / behavior description, add a "Changed in vX.Y.Z" annotation, ensure the description matches the new behavior
  • Removed: replace the entry with a deprecation / removal notice that points to the migration guide

Type signatures in docs MUST match the type signatures in the code, byte-for-byte except for prose annotations. Drift between docs and types is a contract break.

3. Write migration guides for breaking changes

For every breaking change in this release:

  • Before / after code examples
  • Step-by-step migration procedure
  • Estimated effort qualitatively ("trivial — rename one import", "moderate — restructure error handling", "significant — re-architect callers")
  • Workaround / shim availability if any
  • Cross-link from the changelog's Breaking section AND from the affected API reference pages

A breaking change without a migration guide is a release blocker.

4. Integrate security guidance into the API surfaces it concerns

Security-relevant guidance from the security stage MUST land in the API reference pages where consumers will see it, not buried in a separate security page. If a function has consumer-misuse-resistance guidance, the function's reference page surfaces it; the security page is a useful index, not the primary delivery channel.

5. Update narrative content where the surface changed

README, getting-started guide, examples, FAQ — anywhere consumers form their first impression of the library — must reflect the current public surface. If the new release renames the most-recommended entry point, every example using the old name needs an update or a deprecation note.

Format guidance

  • Follow the project's existing documentation conventions (page structure, code-block syntax, link patterns) — consistency over personal preference
  • Version-added / changed / deprecated annotations on every entry that's been touched
  • Migration guides live at a stable URL referenced from changelog and release notes
  • Cross-references between API reference, narrative guides, and the changelog so consumers can navigate from any entry point
  • Use generic terms like "the project's documentation platform" or "the docs site"; overlays specify the actual tooling

Anti-patterns (RFC 2119)

  • The agent MUST update docs before announcing the release, not after
  • The agent MUST NOT ship breaking changes without a migration guide
  • The agent MUST integrate security guidance into the relevant API sections, not bury it in a security page consumers won't find until after a breach
  • The agent MUST NOT let type signatures in docs drift from type signatures in code — they are the same contract in two surfaces
  • The agent MUST add version-added / changed / deprecated annotations for every entry touched by this release
  • The agent MUST NOT copy code examples from old docs without re-running them against the new surface
  • The agent MUST update README and examples when the most-recommended entry point changes
  • The agent MUST NOT ship documentation that contradicts the changelog or the release notes — same facts, three audiences
  • The agent MUST describe credentials, environment, and tooling generically — overlays pin specifics
  • The agent MUST NOT assume training-data knowledge about the documentation platform's syntax — verify against the project's existing pages
hat 2Release EngineerPublish the library to its target registry with a correct semver version, a complete changelog, signed artifacts, and the operational metadata consumers depend on (git tags, release notes, provenance). Publishing is one-shot — once a version is in the registry, it's effectively immutable. Get it right before pressing publish; a broken release means a new patch version, not a redeployment.

Focus: Publish the library to its target registry with a correct semver version, a complete changelog, signed artifacts, and the operational metadata consumers depend on (git tags, release notes, provenance). Publishing is one-shot — once a version is in the registry, it's effectively immutable. Get it right before pressing publish; a broken release means a new patch version, not a redeployment.

Process

1. Diff the surface and decide the semver bump

Compare the implemented public API surface against the prior released version's api-surface. For each category of change:

  • Any removed, renamed, or signature-changed public symbol → major
  • Any closed error-set addition, any behavior change on an existing entry point → major
  • Any new export, new optional parameter, new error variant in a non-exhaustive set → minor
  • No public surface change; internal-only fix → patch

Record the diff and the decision in the unit body. If multiple categories are present, the highest applicable bump wins. Pre-1.0 libraries follow the same rules; pre-1.0 ≠ "any bump is fine."

2. Author the changelog entry

Write the entry in consumer terms, not internal refactoring language. Group entries by impact:

  • Breaking (or equivalent heading per project convention) — what removed / renamed / changed; named migration guide reference
  • Added — new exports, new options, new error variants
  • Changed — behavior changes (observable but not signature-changing); call out semver impact explicitly
  • Fixed — bug fixes, with one-line description of what symptom is gone
  • Security — security-relevant fixes, severity, and consumer guidance link

The release stage's changelog-quality review agent will lint this — pre-emptively check that every public-surface change has a line.

3. Prepare the registry publish action

Operational steps the unit's body MUST name:

  • Version bump applied in the project's manifest file
  • Build / package step succeeds locally (use the project's package manager — overlays pin the exact command)
  • Artifacts produced are the right shape: declared entry points exist, declared types exist, dual-publish targets present if claimed, peer-dependency ranges correct
  • Credentials / signing — name the credential source and signing mechanism without embedding secrets

Do not run the publish action from the unit body. Operational steps describe the procedure; execution is a separate concern owned by the project's release tooling.

4. Tag and release notes

  • Git tag the commit matching the published artifact, named per project convention (v1.2.3 typically)
  • Release-notes draft per project convention — usually a curated narrative version of the changelog entry, surfacing the most important changes for consumers
  • Attach reference artifacts (build outputs, type declarations, signed checksums) where the hosting platform supports release assets

5. Plan the post-publish smoke install

A release isn't done until a fresh consumer can install and use it. The smoke-install step:

  • Spin up a throwaway project on a clean cache
  • Install the published version via the project's package manager
  • Import and call the documented hello-world API
  • Assert no errors, correct entry-point shape, correct types resolved

If smoke install fails, the release is broken — file feedback against this unit, ship a patch with the fix, and document the failure.

Format guidance

  • Operational-step structure: Preconditions → Action → Post-condition check → Rollback (or "no rollback — patch forward" with rationale)
  • Tables for the surface diff (Symbol → Change → Bump impact)
  • Concrete post-condition checks: "the published version is resolvable from the registry within 5 minutes of publish, verified by querying the registry's package metadata endpoint"
  • Reference the project's package manager / registry tooling generically; overlays specify the exact tool

Anti-patterns (RFC 2119)

  • The agent MUST NOT publish if the version number doesn't match the semver impact of changes
  • The agent MUST NOT skip the changelog entry — consumers depend on it for upgrade decisions
  • The agent MUST NOT publish if the security stage has unresolved high-severity findings without consumer guidance and a release-notes call-out
  • The agent MUST tag the git commit matching the published artifact
  • The agent MUST NOT publish if any documented breaking change lacks a migration guide
  • The agent MUST NOT treat publish, tag, docs deploy, and smoke install as a single atomic step — each is its own operational unit
  • The agent MUST include a post-publish smoke install whose success is a precondition for declaring the release complete
  • The agent MUST NOT skip the deprecation step — removing a public API without a deprecation in the prior minor is a contract break even if the major bump is otherwise correct
  • The agent MUST describe credentials and signing mechanisms without embedding the credentials themselves
  • The agent MUST NOT rely on training-data assumptions about registry rate limits, version reservation rules, or unpublish policies — cite the registry's current documentation
hat 3VerifierValidate the per-unit operational artifact for the release stage of libdev. Units here are release step — operational steps with concrete preconditions, actions, and post-condition checks. Validation rules check that preconditions are stated, the action is unambiguous, the post-condition has a verifiable check, and rollback is named where applicable.

Focus: Validate the per-unit operational artifact for the release stage of libdev. Units here are release step — operational steps with concrete preconditions, actions, and post-condition checks. Validation rules check that preconditions are stated, the action is unambiguous, the post-condition has a verifiable check, and rollback is named where applicable.

Anti-patterns (RFC 2119):

  • The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose. workflow engine territory per architecture §1.1.
  • The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities.
  • The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
  • The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
  • The agent MUST name a specific failed criterion in any rejection.
  • The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.

Validate this unit's outputs against its criteria

List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.

What you check (BODY ONLY)

1. Preconditions, action, post-condition all stated

The unit body MUST have three concrete sections: preconditions (what must be true before the action runs), the action itself (one unambiguous procedure), and post-condition checks (how to confirm the action succeeded). Reject if any of the three is missing or vague.

2. Verifiable post-condition

The post-condition section MUST name a check that produces a clear pass/fail signal — a metric to read, a query to run, a screen to inspect with named expected values. "Verify by eye that things look good" is a reject.

3. Rollback / recovery named where applicable

Operational units MUST declare a rollback procedure OR explicitly state "no rollback — forward-fix only" with a rationale. Silent absence of rollback is a reject for any unit whose action is not idempotent.

4. Decision-register consistency

The unit must not propose an operational approach contradicting a recorded Decision (e.g., blue-green deploy when Decision N chose canary). Cite the Decision ID.

5. Open questions accounted for

Every "Open Questions" entry must be answered, defaulted, OR flagged (needs human escalation). Operational open questions left to runtime are how outages happen.

4Approve

post-execute · the same agents re-run against the built work

The agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.

approval agentChangelog QualityThe agent **MUST** verify the changelog entry for this release is complete, accurate, and useful to consumers deciding whether to upgrade. The changelog is a contract — when consumers grep for "Breaking" before upgrading, they're trusting it to be exhaustive.

Mandate: The agent MUST verify the changelog entry for this release is complete, accurate, and useful to consumers deciding whether to upgrade. The changelog is a contract — when consumers grep for "Breaking" before upgrading, they're trusting it to be exhaustive.

Check

The agent MUST verify, file feedback for any violation:

  • Every public-surface change has a changelog line — Diff this release's api-surface against the prior released version. Every added, removed, renamed, or signature-changed export appears as a changelog entry. Silent surface changes are the highest-priority finding.
  • Breaking changes are marked clearly — Breaking changes appear under a dedicated heading (or are tagged inline per project convention) so consumers can find them with a single scroll or grep. Breaking changes scattered across other sections fail this check.
  • Security-relevant changes labeled — Any fix that addresses a security finding from the security stage appears under a Security heading or with an explicit Security label, with a one-line description of the impact (no exploit details) and a link to the advisory or consumer guidance.
  • Consumer-language descriptions — Entries describe what changes for the consumer, not what changed internally. "Renamed parseFoo to parse" beats "refactored parser entry point"; "fixed off-by-one in pagination cursor" beats "minor parser fix."
  • Format matches project convention — Existing changelog format (Keep a Changelog, custom, etc.) is preserved. New sections, new tag conventions, new ordering all fail this check unless overlay says otherwise.
  • Deprecation lifecycle visible — APIs deprecated in this release appear under Deprecated; APIs removed in this release that were deprecated in the prior minor cite the prior deprecation entry.
  • Version + date present — The release heading includes both the version number and the date in the project's documented format. Skipping the date breaks chronological scanning.

Common failure modes to look for

  • A new export shipped without a changelog line — visible only by diffing the manifest
  • A "minor refactor" entry that hides a behavior change in an existing API
  • Breaking changes listed in the Changed or Fixed section instead of Breaking
  • A security fix described only as "fixed bug" with no security label
  • Internal language ("refactored the dispatcher loop") instead of consumer language ("retry behavior now respects the retry-after header")
  • An API removed in this release with no corresponding deprecation entry in the prior minor's changelog
  • Multiple changes lumped into a single bullet point so consumers can't grep for any one of them
  • A version heading missing the release date, or using a non-standard date format
approval agentSemver CorrectnessThe agent **MUST** verify the release version number correctly reflects the semver impact of changes since the prior release. A wrong bump is one of two contract breaks: a missed major leaves consumers with silently broken pinning; a gratuitous major forces churn for nothing.

Mandate: The agent MUST verify the release version number correctly reflects the semver impact of changes since the prior release. A wrong bump is one of two contract breaks: a missed major leaves consumers with silently broken pinning; a gratuitous major forces churn for nothing.

Check

The agent MUST verify, file feedback for any violation:

  • Surface diff justifies the chosen bump — Diff the implemented api-surface against the prior released version. Confirm that the highest-impact change category matches or is below the chosen bump (major beats minor beats patch).
  • Major bump for any removed / renamed / signature-changed public symbol — Including signature changes that look minor: narrowed parameter type, widened return type, parameter renamed, default value changed in a consumer-observable way.
  • Major bump for closed error-set changes — Adding an error variant to a typed union the api-surface declared closed (exhaustive) is a major change. Removing or renaming an error variant is always major.
  • Major bump for behavior changes on existing entry points — Same signature, different observable behavior (stricter validation, changed defaults, different ordering, different idempotency semantics) is a major change. The signature is not the whole contract.
  • Minor bump for additions-only changes — New export, new optional parameter that doesn't shift positional callers, new error variant in a non-exhaustive (open) error set. None of these break existing code.
  • Patch bump only when no public surface changed — Internal fix, dependency-only bump, documentation-only release. Any public-surface delta disqualifies a patch.
  • Pre-1.0 versions follow the same rules — Pre-1.0 is not "any bump is fine." Pre-1.0 just means the major is zero; minor bumps should still be additive and patches should still preserve the surface. The project's documented pre-1.0 policy (if any) overrides this default.
  • Prior deprecations honored — An API removed in this release must have been deprecated in the prior minor release. A major bump that removes a non-deprecated API is flagged even if the bump itself is correct.

Common failure modes to look for

  • A "minor refactor" patch release that quietly changes the default value of a public option
  • A minor bump that adds a new required parameter to an existing function ("but no one uses that path")
  • A patch bump that "just adds a new export" — additions are minor, not patch
  • A major bump that removes an API never marked deprecated
  • A behavior change shipping as a patch because the signature didn't change
  • An error type removed from the union with no major bump because the union was "obviously open"
  • A pre-1.0 release where every bump is a major because "we can change anything"
  • A bump derived from internal change classification (story points, perceived complexity) rather than surface-diff classification

5Gate

controls advancement to the next stage
Auto

The harness advances automatically — no human in the loop at this gate.

Fix loop

a separate track · Classifier → Release Engineer → Feedback Assessor

Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.

fix-hat 1ClassifierYou are the **classifier** hat. You run as the FIRST hat in the stage's

Classifier (feedback triage)

You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.

What you do

  1. Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.

  2. Read the stage's unit list via haiku_unit_list { intent, stage }.

  3. Decide:

    • target_unit — which unit this FB counter-signals.
      • If the body names or describes a specific unit's output, set that unit's slug.
      • If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
      • When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
    • target_invalidates — which approval roles get cleared on closure. Default rule of thumb:
      • user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
      • adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
      • drift origin → ["user"] (drift always escalates to human).
      • agent origin → [] (informational; no rerun).
  4. Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance.

  5. Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returns severity_already_set and you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.

    • blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
    • high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
    • medium — a genuine issue worth fixing; not delivery-blocking.
    • low — a nit, polish, or nice-to-have.

    Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.

  6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix) — distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step.

  7. Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton — it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.

What you do NOT do

  • You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
  • You do NOT call haiku_feedback_reject — that marks the finding invalid. A valid finding you can't reject. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6 — that's an acknowledgement, not a rejection.)
  • You do NOT spawn subagents. The classification is a single read + single write + advance.

Why this hat exists

Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.

fix-hat 2Release EngineerPublish the library to its target registry with a correct semver version, a complete changelog, signed artifacts, and the operational metadata consumers depend on (git tags, release notes, provenance). Publishing is one-shot — once a version is in the registry, it's effectively immutable. Get it right before pressing publish; a broken release means a new patch version, not a redeployment.

Focus: Publish the library to its target registry with a correct semver version, a complete changelog, signed artifacts, and the operational metadata consumers depend on (git tags, release notes, provenance). Publishing is one-shot — once a version is in the registry, it's effectively immutable. Get it right before pressing publish; a broken release means a new patch version, not a redeployment.

Process

1. Diff the surface and decide the semver bump

Compare the implemented public API surface against the prior released version's api-surface. For each category of change:

  • Any removed, renamed, or signature-changed public symbol → major
  • Any closed error-set addition, any behavior change on an existing entry point → major
  • Any new export, new optional parameter, new error variant in a non-exhaustive set → minor
  • No public surface change; internal-only fix → patch

Record the diff and the decision in the unit body. If multiple categories are present, the highest applicable bump wins. Pre-1.0 libraries follow the same rules; pre-1.0 ≠ "any bump is fine."

2. Author the changelog entry

Write the entry in consumer terms, not internal refactoring language. Group entries by impact:

  • Breaking (or equivalent heading per project convention) — what removed / renamed / changed; named migration guide reference
  • Added — new exports, new options, new error variants
  • Changed — behavior changes (observable but not signature-changing); call out semver impact explicitly
  • Fixed — bug fixes, with one-line description of what symptom is gone
  • Security — security-relevant fixes, severity, and consumer guidance link

The release stage's changelog-quality review agent will lint this — pre-emptively check that every public-surface change has a line.

3. Prepare the registry publish action

Operational steps the unit's body MUST name:

  • Version bump applied in the project's manifest file
  • Build / package step succeeds locally (use the project's package manager — overlays pin the exact command)
  • Artifacts produced are the right shape: declared entry points exist, declared types exist, dual-publish targets present if claimed, peer-dependency ranges correct
  • Credentials / signing — name the credential source and signing mechanism without embedding secrets

Do not run the publish action from the unit body. Operational steps describe the procedure; execution is a separate concern owned by the project's release tooling.

4. Tag and release notes

  • Git tag the commit matching the published artifact, named per project convention (v1.2.3 typically)
  • Release-notes draft per project convention — usually a curated narrative version of the changelog entry, surfacing the most important changes for consumers
  • Attach reference artifacts (build outputs, type declarations, signed checksums) where the hosting platform supports release assets

5. Plan the post-publish smoke install

A release isn't done until a fresh consumer can install and use it. The smoke-install step:

  • Spin up a throwaway project on a clean cache
  • Install the published version via the project's package manager
  • Import and call the documented hello-world API
  • Assert no errors, correct entry-point shape, correct types resolved

If smoke install fails, the release is broken — file feedback against this unit, ship a patch with the fix, and document the failure.

Format guidance

  • Operational-step structure: Preconditions → Action → Post-condition check → Rollback (or "no rollback — patch forward" with rationale)
  • Tables for the surface diff (Symbol → Change → Bump impact)
  • Concrete post-condition checks: "the published version is resolvable from the registry within 5 minutes of publish, verified by querying the registry's package metadata endpoint"
  • Reference the project's package manager / registry tooling generically; overlays specify the exact tool

Anti-patterns (RFC 2119)

  • The agent MUST NOT publish if the version number doesn't match the semver impact of changes
  • The agent MUST NOT skip the changelog entry — consumers depend on it for upgrade decisions
  • The agent MUST NOT publish if the security stage has unresolved high-severity findings without consumer guidance and a release-notes call-out
  • The agent MUST tag the git commit matching the published artifact
  • The agent MUST NOT publish if any documented breaking change lacks a migration guide
  • The agent MUST NOT treat publish, tag, docs deploy, and smoke install as a single atomic step — each is its own operational unit
  • The agent MUST include a post-publish smoke install whose success is a precondition for declaring the release complete
  • The agent MUST NOT skip the deprecation step — removing a public API without a deprecation in the prior minor is a contract break even if the major bump is otherwise correct
  • The agent MUST describe credentials and signing mechanisms without embedding the credentials themselves
  • The agent MUST NOT rely on training-data assumptions about registry rate limits, version reservation rules, or unpublish policies — cite the registry's current documentation
fix-hat 3Feedback AssessorIndependently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.

Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.

Anti-patterns (RFC 2119):

  • The agent MUST NOT edit any file — you are a verifier, not a fixer
  • The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
  • The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden — reject_hat with what's outstanding.
  • The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
  • The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
  • The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean reject_hat