Engineering

Library Development Studio

Lifecycle for libraries, SDKs, and CLI tools

4 stages · 13 hats · 10 review agents · Persistence: auto-detected

Deprecated 2026-05-27. Library / SDK / CLI work folded into the software (appdev) studio, which now carries an optional release stage and drops design / product / operations for non-application intents. This studio is hidden from new-intent pickers but stays resolvable so in-flight studio: libdev intents finish unchanged. Do not delete.

Library Development

Lifecycle for libraries, SDKs, and CLI tools. Differs from application development: no product or design phases — inception directly covers discovery and API surface (semver policy, public contract, extension points), development builds against that contract, and release publishes rather than deploys.

The lifecycle an intent runs

1. Inception (Ask gate): Understand the problem, define API surface, and elaborate into units
   4 hats · 3 review agents · 3-step fix loop · 2 discovery · 1 output
2. Development (External / Ask gate): Implement the library against the API contract from inception
   3 hats · 3 review agents · 3-step fix loop · 1 output
3. Security (External / Ask gate): Supply chain, dependency audit, and consumer-misuse threat model
   3 hats · 2 review agents · 3-step fix loop · 1 output
4. Release (Auto gate): Publish, changelog, documentation, and deprecation policy
   3 hats · 2 review agents · 3-step fix loop · 1 output

At intent close

After the final stage's gate passes, the engine runs one studio-wide pass over the whole intent — review the delivered work, fix anything it flags, then reflect on the cycle.

Intent-completion review

Studio-wide agents audit the delivered intent

Cross Stage Consistency

Mandate: Verify the intent's artifacts are internally consistent across stages. You are the only reviewer that sees the whole intent at once — your job is to catch seams that per-stage reviewers miss.

Check:

The agent MUST verify that each stage's outputs align with what upstream stages specified — no dropped requirements, no silent scope expansion
The agent MUST verify that naming is consistent across stages — a concept named one thing upstream should carry the same name downstream
The agent MUST verify that stages' declared outputs exist at the paths their unit frontmatter promised
The agent MUST verify that the stages collectively deliver the intent's stated goal (read intent.md) — partial delivery is a finding
The agent MUST verify that concerns raised by any stage's review were actually addressed (not silently ignored)
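
The output-existence check in the list above can be sketched as a plain path lookup. This is a minimal sketch, assuming the declared outputs have already been read out of each unit's frontmatter into a mapping; the `declared_outputs` shape and the `missing_outputs` helper are hypothetical names for illustration, not part of the studio's API.

```python
from pathlib import Path


def missing_outputs(declared_outputs: dict[str, list[str]], root: Path) -> list[tuple[str, str]]:
    """Return (stage, path) pairs whose declared output is absent under root.

    declared_outputs is a hypothetical mapping of stage name to the relative
    paths that stage's unit frontmatter promised.
    """
    missing = []
    for stage, paths in declared_outputs.items():
        for rel in paths:
            # A promised path that does not exist on disk is a concrete finding.
            if not (root / rel).exists():
                missing.append((stage, rel))
    return missing
```

Each returned pair names the stage and the promised path, which is the level of detail a consistency finding needs.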

Anti-patterns (RFC 2119):

The agent MUST NOT re-litigate decisions already approved at each stage's gate
The agent MUST NOT propose new features or scope additions
The agent MUST NOT flag stylistic preferences — concrete divergence only

Delivery Verifier

Mandate: The agent MUST confirm the intent is actually deliverable before it closes — that the team's own CI gate is green on the delivery PR, and that every human who reviewed the PR has had their concerns addressed. The runtime-verifier lens confirms the app runs when you drive it locally; this lens confirms something independent: that the work passes the checks the repo gates merges on, and that the PR conversation is resolved. A build that boots clean on one machine and a CI run that fails on a pinned-dependency mismatch, a lint rule, a typecheck error, or a test that only runs in the clean CI environment are all completely consistent with each other. "It works on my machine" is not "CI is green." Both gates must hold.

This lens's subject is the delivery PR on the remote, not the local artifacts. When you have provider access — an authenticated VCS CLI (gh for GitHub, glab for GitLab) or a configured provider — you read its checks and its review conversation, reply to and resolve review threads, and file findings for anything that isn't green or isn't addressed; the studio fix-hat loop lands the code, and you re-audit until the PR is clean. You cannot assume that access exists: there may be no remote, no CLI, or a CLI that isn't authenticated. The rule that survives every one of those cases is the same — you never sign off on a delivery you couldn't actually verify. A check you couldn't run is not a check that passed.

Resolve the delivery PR — and what you can prove without a provider

Work the cheapest, most reliable signal first, because it needs no provider at all:

Is the work already merged? Ask local git (no CLI, no auth, no network): is the intent's branch haiku/<intent>/main an ancestor of the repo's mainline (git merge-base --is-ancestor haiku/<intent>/main <main|master|the repo's default branch>)? If it's merged, that IS your proof. A host only lets a PR merge once its branch protection is satisfied — CI green, required reviews approved. The merge is the host's own gate firing; you don't need to re-read CI to trust it. Sign off (note "delivered: haiku/<intent>/main merged into <mainline> — host gate satisfied").

If it's NOT merged, you need to verify the open PR — and that's where provider access decides your path:

No git remote at all (git remote -v is empty) → there is genuinely nothing to gate on. Terminate clean: "no remote — CI verification not applicable." This is a SKIP.
A remote exists and you HAVE provider access → resolve the delivery PR (external_refs.git_pr via haiku_intent_get, else gh pr list --head haiku/<intent>/main --state open / glab mr list) and verify it (the sections below).
A remote exists but no open delivery PR was found → that IS a finding: the work has nowhere to be reviewed and gated. File it and stop.
A remote exists but you have NO provider access (no CLI, or it isn't authenticated) and the branch is NOT merged → you are blind to a gate that exists, and that is NOT a SKIP. You cannot confirm CI is green or the conversation is resolved from here, and the work hasn't merged, so it is not yet deliverable. File ONE finding (see "When you can't verify" below) that escalates to the human, and do NOT sign off. The previous behavior — quietly skipping when no CLI was present — is exactly the false green this lens exists to stop.
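
The branching above can be read as a small decision function. The following is a sketch under stated assumptions: the `Outcome` names, `resolve_path`, and its boolean inputs are hypothetical, and only the `git merge-base --is-ancestor` invocation comes from the text. How the remote, provider access, and PR lookup are probed is left to the caller.

```python
import subprocess
from enum import Enum


class Outcome(Enum):
    SIGN_OFF = "merged: host gate satisfied"
    SKIP = "no remote: CI verification not applicable"
    VERIFY_PR = "verify the open delivery PR"
    FINDING_NO_PR = "remote exists but no open delivery PR"
    HOLD_BLIND = "remote exists, no provider access, not merged"


def is_merged(branch: str, mainline: str) -> bool:
    """True when branch is an ancestor of mainline, using local git only.

    Any non-zero exit (including an unknown branch) is treated as not merged.
    """
    result = subprocess.run(
        ["git", "merge-base", "--is-ancestor", branch, mainline],
        capture_output=True,
    )
    return result.returncode == 0


def resolve_path(merged: bool, has_remote: bool, has_provider: bool, pr_found: bool) -> Outcome:
    """Pick the verification path, cheapest signal first."""
    if merged:
        return Outcome.SIGN_OFF
    if not has_remote:
        return Outcome.SKIP
    if not has_provider:
        # Blind to a gate that exists: a hold for the human, never a skip.
        return Outcome.HOLD_BLIND
    if not pr_found:
        return Outcome.FINDING_NO_PR
    return Outcome.VERIFY_PR
```

The merge check comes first because it needs no provider, so a merged branch signs off even when the CLI is missing.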

Check CI is green

Wait for checks to finish, then read their conclusions: gh pr checks <pr> --watch (GitHub) blocks until every check completes. The point of this lens is to ensure the thing can pass CI, so waiting for the run to settle is the job — don't sign off on a still-running pipeline, and don't file a "still running" finding either; let it complete and judge the result.
All checks success / neutral / skipped → CI is clear of failures. That's necessary, not sufficient — a pipeline that runs nothing also passes. Green is half the question; the other half is the next section.
Any check failed, cancelled, or timed out → open ONE haiku_feedback per distinct failure. Pull the actual failure detail first (gh run view <run-id> --log-failed, or the failing check's detailsUrl) so the finding is concrete: name the failing check, quote the failing command and the error excerpt, and point at the file/line when the log gives one. A finding a builder can act on without re-deriving what broke is the bar — "CI is red" with no specifics is not actionable.
The PR must actually be mergeable, not just green. Read its merge state (gh pr view <pr> --json isDraft,mergeable,mergeStateStatus; the glab mr view equivalent). A PR that's still a draft, has merge conflicts (mergeable: CONFLICTING), or is otherwise blocked from merging is not deliverable even with every check green. Open ONE finding naming the blocker and its remedy (mark a draft "ready for review"; rebase or resolve the conflict). Green checks on an unmergeable PR give the same false confidence as a green no-op check.
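
As a rough illustration of the rules above, the sketch below turns settled check results and the PR's merge state into findings. It assumes the checks have already finished and been parsed into dictionaries with `name` and `conclusion` keys. That shape is a hypothetical simplification, not the exact JSON the CLI emits. The `isDraft` and `mergeable` keys mirror the fields named in the text.

```python
PASSING = {"success", "neutral", "skipped"}


def ci_findings(checks: list[dict], pr: dict) -> list[str]:
    """Return one finding string per failing check or merge blocker."""
    findings = []
    for check in checks:
        conclusion = str(check.get("conclusion", "")).lower()
        # Anything outside the passing set (failed, cancelled, timed out) is a finding.
        if conclusion not in PASSING:
            findings.append(f"check '{check.get('name')}' concluded {conclusion or 'unknown'}")
    if pr.get("isDraft"):
        findings.append("PR is still a draft: mark it ready for review")
    if pr.get("mergeable") == "CONFLICTING":
        findings.append("PR has merge conflicts: rebase or resolve them")
    return findings
```

An empty check list yields no findings here, which is exactly why green alone is not sufficient: a pipeline that runs nothing also passes this test. A real finding would also carry the failing command and log excerpt, which this sketch omits.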

Check CI is meaningful, not just green

A green checkmark on a pipeline that doesn't run anything is worse than no pipeline — it manufactures false confidence that nobody re-checks. Green answers "did the checks that ran pass?" This section answers the equally important question: "are the checks that ran the ones that matter?"

The intent's own quality gates are the reference set. Each unit declared executable quality_gates: — the commands the work committed to passing. Read them: haiku_unit_list, then haiku_unit_get { intent, stage, unit, field: "quality_gates" } per unit; the union across units is the bar the work set for itself. Those gates are exactly the checks that must have a home on the remote. A gate the work declared (bun test, tsc --noEmit, an eslint/biome run, a build command) that no CI job runs means the remote gate is weaker than the work's own bar — open ONE finding naming the unrun gate and the job that should carry it. The fix-hat loop wires it in.
Read what the jobs actually do, not just their names. Pull the pipeline config (.github/workflows/*.yml, .gitlab-ci.yml) and the run logs (gh run view <run-id> --log). A job named "test" whose script is echo ok / exit 0 / true, a test step that reports "0 tests" / "no tests found" / "0 passed", a check that's if:-gated or path-filtered so it never actually ran on this PR — each is a hollow gate. File a finding: the check exists but enforces nothing.
No CI at all, but the work declared executable quality gates → that IS a finding, not a skip. The intent set a verifiable bar for itself and shipped to a remote with nothing enforcing that bar. The fix-hat loop adds the pipeline that runs those gates.
Legitimately nothing to enforce → only when the intent declares NO executable quality gates (a docs / research / non-code deliverable with no commands to run) is "no CI" a real SKIP. State that plainly and don't invent a check the work never asked for.
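
The cross-check between declared gates and what CI runs might look like the sketch below. It assumes the quality gates and the CI job scripts have already been extracted into plain mappings; `unrun_gates`, `hollow_jobs`, and both input shapes are hypothetical. Substring matching is a crude heuristic, so a real audit would still read the run logs.

```python
HOLLOW_SCRIPTS = {"echo ok", "exit 0", "true"}


def unrun_gates(quality_gates: dict[str, list[str]], job_scripts: dict[str, list[str]]) -> list[str]:
    """Gates declared by any unit that no CI job's script lines contain."""
    # The union across units is the bar the work set for itself.
    declared = {gate for gates in quality_gates.values() for gate in gates}
    ci_lines = [line for lines in job_scripts.values() for line in lines]
    return sorted(g for g in declared if not any(g in line for line in ci_lines))


def hollow_jobs(job_scripts: dict[str, list[str]]) -> list[str]:
    """Jobs whose every script line is a no-op (a job with no lines also counts)."""
    return sorted(
        job for job, lines in job_scripts.items()
        if all(line.strip() in HOLLOW_SCRIPTS for line in lines)
    )
```

For example, with units declaring `bun test` and `tsc --noEmit`, a pipeline whose "test" job only runs `echo ok` reports `bun test` as unrun and "test" as hollow.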

Address the PR conversation

Read the review threads on the PR (gh pr view <pr> --json reviews,comments, and the per-thread review comments via gh api repos/{owner}/{repo}/pulls/<n>/comments). A GitLab merge request uses the glab discussion equivalents.
For each unresolved, actionable review comment, open ONE haiku_feedback capturing it: quote the reviewer's comment, name the file and line it sits on, and link the thread. Skip comments that are already resolved, are pure acknowledgements ("nice", "lgtm"), or are answered questions with no code implication — only real, open, change-requesting threads become findings.
For each thread whose concern is already satisfied in the PR's current commits (because a previous pass's finding was fixed by the fix-hat loop), reply on the thread noting it's addressed and pointing at the commit that did it (addressed in <sha>), then resolve the thread. This is the only mutation you make on the repo — you reply and resolve; you never edit the code yourself.
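
The thread handling above amounts to a three-way triage. Here is a minimal sketch, assuming each thread has been reduced to a dictionary with `resolved` and `body` keys plus an optional `addressed_in` commit sha that was verified against the actual diff. All three field names and the `triage_thread` helper are hypothetical. Spotting an answered question with no code implication needs judgment and is not modelled.

```python
ACKNOWLEDGEMENTS = {"nice", "lgtm"}


def triage_thread(thread: dict) -> str:
    """Return 'skip', 'reply_and_resolve', or 'finding' for one review thread."""
    if thread["resolved"]:
        return "skip"
    # Pure acknowledgements carry no change request.
    if thread["body"].strip().lower().rstrip("!.") in ACKNOWLEDGEMENTS:
        return "skip"
    # Already satisfied in the current commits: reply with the sha, then resolve.
    if thread.get("addressed_in"):
        return "reply_and_resolve"
    return "finding"
```

Only the `finding` outcome opens feedback; `reply_and_resolve` is the single mutation the lens makes on the repo.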

When you can't verify

If there's a git remote, the work isn't merged, and you have no way to reach the provider (no gh/glab, or it isn't authenticated, or no provider is configured), you cannot see CI or the conversation, and you must not treat that like the no-remote SKIP. A gate exists; you're just blind to it. Do this:

File ONE haiku_feedback (intent scope) titled e.g. "Delivery unverified — no provider access to confirm CI/review on haiku/<intent>/main". State plainly what you couldn't check and what the human needs to do: confirm CI is green and the review conversation is resolved on the delivery PR, then merge it — once it merges you'll detect that on the next pass (local git) and sign off — or make a provider CLI available/authenticated so you can verify directly.
Set severity: medium. This holds your sign-off (the engine won't stamp delivery-verifier while the finding is open) without spinning the studio fix-hat loop — there is no code defect to fix, and a fixer can't install or authenticate a CLI. It's a hold for the human, not work for a hat.
Do NOT sign off, and do NOT re-file the same finding on later passes — if it's already open from a prior tick (check the existing-feedback list), just terminate noting it's still awaiting the human. When the human merges or grants access, your next run resolves the real way (merge proof, or live CI verification).
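
The file-once behaviour of the blind-case hold can be sketched as follows. This assumes the existing-feedback list is available as a list of titles; the payload keys (`title`, `severity`, `scope`) and the `blind_hold` helper are hypothetical and do not reflect the actual haiku_feedback schema.

```python
from typing import Optional

HOLD_TITLE_PREFIX = "Delivery unverified"


def blind_hold(intent: str, existing_titles: list[str]) -> Optional[dict]:
    """Build the one medium-severity hold finding, or None if it is already open."""
    # Never re-file: an open hold from a prior pass is still awaiting the human.
    if any(title.startswith(HOLD_TITLE_PREFIX) for title in existing_titles):
        return None
    branch = f"haiku/{intent}/main"
    return {
        "title": f"{HOLD_TITLE_PREFIX}: no provider access to confirm CI/review on {branch}",
        "severity": "medium",  # holds sign-off without spinning the fix-hat loop
        "scope": "intent",
    }
```

A `None` result means the lens terminates with a note that the hold is still open, rather than filing a duplicate.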

Sign-off rule

Terminate clean — which the engine reads as your approval — only when one of these is true:

The branch is merged into mainline (the host's own gate already fired — see "Resolve the delivery PR"); or
You verified the open PR and it's fully clean: CI is green (no failing checks), CI is meaningful (the intent's quality gates are actually run by the pipeline and no green check is a no-op), the PR is mergeable (not draft, no conflicts), and no unresolved, actionable review thread remains; or
There's genuinely nothing to gate — no git remote, or a non-code deliverable with no executable quality gates.

Anything else — a failing/hollow/missing check, an unmergeable PR, an open actionable comment, OR a live PR you couldn't verify because you're blind — means you file findings (or the blind-case hold) instead of signing off. A check you couldn't run is not a check that passed; do not sign off to get unstuck. The fix-hat loop lands the code corrections, the human resolves the blind case, and you run again and re-judge against the new state. Keep doing that until the delivery is genuinely clean — that, and only that, is a delivered intent.
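
The sign-off rule reduces to a single predicate over the state gathered in the earlier sections. The sketch below is illustrative only: `DeliveryState` and its field names are hypothetical, and each flag is assumed to have been established by the corresponding check rather than guessed.

```python
from dataclasses import dataclass


@dataclass
class DeliveryState:
    merged: bool = False           # branch is an ancestor of mainline
    nothing_to_gate: bool = False  # no remote, or no executable quality gates
    pr_verified: bool = False      # False when blind or when no PR was checked
    ci_green: bool = False
    ci_meaningful: bool = False
    mergeable: bool = False
    open_threads: int = 0          # unresolved, actionable review threads


def may_sign_off(s: DeliveryState) -> bool:
    """True only for the three clean-termination cases; everything else files findings."""
    if s.merged or s.nothing_to_gate:
        return True
    return (
        s.pr_verified
        and s.ci_green
        and s.ci_meaningful
        and s.mergeable
        and s.open_threads == 0
    )
```

Because `pr_verified` defaults to False, a PR that could not be inspected never signs off, even if every other flag is set.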

Common failure modes to look for

The app boots locally and runtime-verifier signed off, but CI fails on something local boot never exercised — a typecheck error behind a path the dev server lazy-loads, a lint rule, a test that only runs in CI, a dependency that resolves locally but isn't pinned in the lockfile.
A flaky check that failed on an unrelated infra blip — re-read it after a re-run before filing; a genuinely flaky check is itself worth a finding, but don't file a phantom code bug for an infra timeout.
Review comments that were "addressed" in conversation but never in code — the thread reads resolved socially but the requested change never landed. Verify against the actual diff, not the reply text.
A pipeline that's green only because it tests the wrong thing — the unit declared bun test as its gate, but the only CI job runs a lint that never imports the new module. Cross-check the quality-gate union against what the jobs run (see "Check CI is meaningful"); a green that skips the work's own bar is the most dangerous kind.
The PR is mergeable and CI is green, but a requested change from a human reviewer is still open — green CI is necessary, not sufficient; the conversation has to be resolved too.