Migration · stage 3 of 5

Migrate

Ask gate

Implement migration scripts, adapters, and data transforms

Implement the mapping spec as runnable migration code — extractors, transforms, loaders, idempotency keys, dry-run modes, checkpointing — and prove it does what the spec says with integration-test evidence. This is the build stage of the migration: units are execution work with acceptance criteria and executable verify commands.

Scope

Implementation of the mapping spec plus the integration-test evidence that the implementation honors it. Migrate decides how the data actually moves — not what maps to what (mapping) or whether the migrated target reconciles against the source (validation). The mapping spec is the contract; the code implements it and nothing beyond it.

What to do

  • Implement extract, transform, and load for each entity surface against its mapping rows, with error handling, idempotency, dry-run support, and checkpointing.
  • Produce integration-test evidence against a non-production target: happy path, mapping-derived edge cases, and an idempotency proof that a re-run produces no duplicates.
  • Trace every implementation behavior back to a mapping-spec row and every test back to a behavior.
  • Keep verify commands executable so the result is mechanically checkable.

What NOT to do

  • Don't reinterpret or extend the mapping spec — a wrong spec is a revisit to mapping, not a change made in the code.
  • Don't run reconciliation or parity testing here; that's validation against the artifacts you produce.
  • Don't advance with failing integration tests or an unproven idempotency claim.
  • Don't migrate a surface the mapping spec didn't cover.

How the engine runs this stage

1. Elaborate

autonomous · plan the work, fan out discovery, declare outputs

Discovery fan-out

knowledge artifact

Migration Artifacts

Document the implemented migration scripts, adapters, and test results. This output feeds the validation stage for integrity verification.

Content Guide

Structure the artifacts around the implementation:

  • Script inventory — list of migration scripts with purpose, execution order, and dependencies
  • Adapter documentation — data adapters and transformation logic with interface contracts
  • Dry-run results — output from executing scripts in dry-run mode against representative data
  • Integration test results — test coverage summary with pass/fail for happy path, edge cases, and failure scenarios
  • Idempotency verification — evidence that re-running scripts produces consistent results
  • Execution plan — recommended order, parallelism, and checkpointing strategy
  • Known limitations — any gaps between the mapping spec and what was implemented, with rationale

Quality Signals

  • Every script is idempotent and produces execution logs
  • Dry-run mode exists and its output matches expectations from the mapping spec
  • Integration tests cover happy path, nulls, encoding, constraints, and failure recovery
  • No script hardcodes environment-specific values

Phase guidance

phase override · ELABORATION

Migrate Stage — Elaboration

Criteria Guidance

Good criteria — concrete and verifiable

  • "Migration scripts are idempotent — re-running produces the same result without duplicating data"
  • "Integration tests cover at least: happy path, null handling, encoding edge cases, and constraint violations"
  • "Dry-run mode exists and produces a diff report without writing to the target"

Bad criteria — vague (no clear check)

  • "Scripts work"
  • "Data is migrated"
  • "Tests pass"

Outputs produced

output template

Migration Artifacts

Idempotent migration scripts with integration tests and dry-run capabilities.

Expected Artifacts

  • Migration scripts -- idempotent scripts; re-running them produces the same result without duplicates
  • Integration tests -- happy path, null handling, encoding edge cases, and constraint violations
  • Dry-run mode -- produces a diff report without writing to the target
  • Execution logs -- each script run is logged with results

Quality Signals

  • Scripts are idempotent and logged
  • Integration tests verify row counts, type fidelity, and referential integrity
  • Dry-run output matches expectations from the mapping spec
  • All mapping spec transformations are implemented

2. Review

pre-execute · agents audit the planned spec before any code lands
review agent · Data Integrity

Mandate: The agent MUST verify the migration scripts preserve data integrity — counts reconcile, relationships hold, no silent truncation, idempotency proven, errors captured without halting the run. Integrity gaps that ship at this stage become validation-stage findings or worse, post-cutover incidents.

Check

The agent MUST verify, filing feedback for any violation:

  • Row-count reconciliation — the integration-tester's evidence demonstrates source and target counts match (or differ only by the deltas the mapping spec called for, cited row by row).
  • Referential integrity — foreign-key relationships in the source remain intact in the target. Orphan rows produced by the migration MUST be either zero or explicitly accounted for in the mapping spec.
  • No silent truncation — narrowing-cast rows from the mapping spec MUST have integration-test evidence demonstrating the boundary cases were exercised and reported, not silently truncated.
  • Idempotency proven — the integration-test results include a second-run experiment showing no duplicate rows, no constraint violations on re-run, and identical counts and sampled diffs across the two runs.
  • Error handling proven — at least one failure-injection scenario per recovery path is in the test results, with the script's behavior under failure explicitly documented (reports and continues, or halts cleanly with the cursor preserved). Silent error swallowing is a hard finding.
  • Dry-run faithfulness — the dry-run output and the live-run output are compared, with drift between them flagged as a hard finding. Reviewers downstream rely on dry-run as the preview.
  • Mapping-spec coverage — every row of every mapping table for this stage has at least one test row in the integration tests that exercises it.
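
The first two checks can be sketched as queries. This is a minimal illustration, not the project's tooling: the table names (src_orders, tgt_orders, tgt_customers), the expected_delta parameter, and the choice of SQLite with source and target in one connection are all assumptions made so the example is self-contained.

```python
import sqlite3

def integrity_report(conn, expected_delta=0):
    """Compare source/target counts and list target orders whose customer is missing."""
    source_count = conn.execute("SELECT COUNT(*) FROM src_orders").fetchone()[0]
    target_count = conn.execute("SELECT COUNT(*) FROM tgt_orders").fetchone()[0]
    orphans = [
        row[0]
        for row in conn.execute(
            "SELECT o.id FROM tgt_orders AS o "
            "LEFT JOIN tgt_customers AS c ON c.id = o.customer_id "
            "WHERE c.id IS NULL ORDER BY o.id"
        )
    ]
    return {
        "source_count": source_count,
        "target_count": target_count,
        # expected_delta is the difference the mapping spec calls for (hypothetical parameter)
        "counts_reconcile": source_count - target_count == expected_delta,
        "orphan_order_ids": orphans,
    }

def demo_connection():
    """Hypothetical fixture: counts match, but order 3 points at a missing customer."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        CREATE TABLE tgt_customers (id INTEGER PRIMARY KEY);
        CREATE TABLE tgt_orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
        INSERT INTO src_orders VALUES (1, 10), (2, 10), (3, 11);
        INSERT INTO tgt_customers VALUES (10);
        INSERT INTO tgt_orders VALUES (1, 10), (2, 10), (3, 11);
    """)
    return conn
```

In this fixture the counts reconcile while one orphan exists, which is why a count match alone does not close the referential-integrity check.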

Common failure modes to look for

  • Test results that report counts but no field-level diff on sampled rows
  • Idempotency claim with no second-run evidence
  • Failure injection that names scenarios but doesn't capture the script's actual behavior
  • Dry-run output that's a summary rather than a faithful preview (missing the error-record list, missing the per-row diff)
  • An edge case in the mapping spec with no corresponding test row
  • A "happy path passed" claim with no error-path tests
  • Constraint enforcement on the target that's actually delegated to a post-migration cleanup step rather than the migration itself
  • Connection strings or credentials in the script or test code instead of externalized configuration

3. Execute

per-unit baton · Migration Engineer → Integration Tester → Verifier
hat 1 · Integration Tester

Focus: Verify the migration code for this unit produces correct output against a non-production target. The pipeline runs end-to-end: extract, transform, load, post-load constraint enforcement. Coverage spans the happy path, the edge cases the mapping spec called out, and failure / recovery scenarios. The artifact you produce is the evidence the verifier reads to decide whether the unit advances.

You produce one output: the ## Integration test results section of the unit's body — the test cases run, the data they used, the expected vs. actual results, the idempotency proof, and the failure-injection results.

Process

1. Set up a representative non-production target

Tests run against a target environment that matches production in schema, constraints, indexes, and (within reason) data shape and volume. A test target that's empty or smaller than production by orders of magnitude won't surface volume-driven bugs. Document the test target's setup in the test-results section.

2. Build the test dataset

The test dataset MUST cover:

  • Representative happy-path rows — typical values that exercise the mapping transforms
  • Edge cases the mapping spec called out — every "edge case" note in the schema-mapper's table becomes a test row (or scenario)
  • Boundary values — empty strings, nulls, max-length values, min/max numeric values, earliest/latest timestamps
  • Encoding edges — non-ASCII characters, mixed-case identifiers, whitespace-padded strings, unicode normalization variants
  • Constraint-violating inputs — rows that should be rejected by target constraints; verify the script reports them rather than letting them slide through
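
As a rough sketch of what boundary and encoding rows might look like for a single text column, assuming a hypothetical max_len taken from the target schema (the case names and sample values are illustrative only):

```python
import unicodedata

def edge_case_rows(max_len):
    """Build labeled test values for one text column with a target length limit."""
    composed = unicodedata.normalize("NFC", "Zoë")    # single code point for "ë"
    decomposed = unicodedata.normalize("NFD", "Zoë")  # "e" followed by a combining diaeresis
    return [
        {"case": "empty string", "value": ""},
        {"case": "null", "value": None},
        {"case": "max length", "value": "x" * max_len},
        {"case": "over max length", "value": "x" * (max_len + 1)},  # should be rejected and reported
        {"case": "whitespace padded", "value": "  padded  "},
        {"case": "non-ASCII", "value": "Ünïcödé 日本語"},
        {"case": "NFC form", "value": composed},
        {"case": "NFD form", "value": decomposed},
    ]
```

The NFC and NFD rows render identically but differ in length and byte content, so they exercise any transform or constraint that compares or measures strings.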

If the unit's scope is integration mappings rather than data, the test dataset is replayed API requests / events; same coverage discipline applies.

3. Run the migration end-to-end

Execute the script against the test target. Capture:

  • Total records processed, succeeded, rejected
  • Runtime
  • Records that hit error handling (and what was logged)
  • Diff between source and target for sampled rows (field-level, not just counts)
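
A field-level diff on sampled rows could be as small as the sketch below. It assumes rows are dictionaries and that the expected rows have already been derived from the source by applying the mapping spec; the function name and row shape are hypothetical.

```python
def field_level_diff(expected_rows, actual_rows, key="id"):
    """Return (key, field, expected, actual) tuples for every mismatch or missing row."""
    actual_by_key = {row[key]: row for row in actual_rows}
    findings = []
    for expected in expected_rows:
        actual = actual_by_key.get(expected[key])
        if actual is None:
            findings.append((expected[key], "<row>", "present", "missing"))
            continue
        for field, want in expected.items():
            got = actual.get(field)
            if got != want:
                findings.append((expected[key], field, want, got))
    return findings

# Hypothetical sample: counts match (2 vs. 2) but row 2 was truncated to 10 characters.
EXPECTED = [
    {"id": 1, "name": "Zoë Müller", "city": "Köln"},
    {"id": 2, "name": "A" * 12, "city": None},
]
ACTUAL = [
    {"id": 1, "name": "Zoë Müller", "city": "Köln"},
    {"id": 2, "name": "A" * 10, "city": None},
]
```

Here a count-only comparison would pass, while the diff surfaces the truncated name on row 2.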

4. Prove idempotency

Run the script a second time against the same target without resetting. Verify:

  • No duplicate rows produced
  • No constraint violations from the second run
  • Counts match the first run
  • Sampled diffs are identical

Idempotency is the difference between a recoverable migration and a corruption event. Treat the second-run results as a first-class output, not a footnote.
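
One way to structure the second-run experiment is sketched below. It is an illustration under assumptions: an in-memory SQLite target, a hypothetical orders table with no unique constraint (so duplicates are possible), and two stand-in loaders that contrast a plain insert with an insert guarded on a stable identifier.

```python
import sqlite3

def naive_load(conn, rows):
    """Plain INSERT: a re-run appends every row again."""
    with conn:
        conn.executemany("INSERT INTO orders (source_id, total_cents) VALUES (?, ?)", rows)

def guarded_load(conn, rows):
    """Insert a row only if its stable identifier is not already present."""
    with conn:
        conn.executemany(
            "INSERT INTO orders (source_id, total_cents) "
            "SELECT ?, ? WHERE NOT EXISTS (SELECT 1 FROM orders WHERE source_id = ?)",
            [(sid, cents, sid) for sid, cents in rows],
        )

def snapshot(conn):
    """Capture count, number of duplicated keys, and full ordered content."""
    count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    duplicates = conn.execute(
        "SELECT COUNT(*) FROM "
        "(SELECT source_id FROM orders GROUP BY source_id HAVING COUNT(*) > 1)"
    ).fetchone()[0]
    rows = conn.execute(
        "SELECT source_id, total_cents FROM orders ORDER BY source_id, total_cents"
    ).fetchall()
    return {"count": count, "duplicates": duplicates, "rows": rows}

def second_run_report(loader, rows):
    """Run the loader twice against the same target without resetting it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (source_id INTEGER, total_cents INTEGER)")
    loader(conn, rows)
    first = snapshot(conn)
    loader(conn, rows)  # second run, no reset
    second = snapshot(conn)
    idempotent = first == second and second["duplicates"] == 0
    return {"first": first, "second": second, "idempotent": idempotent}
```

Both snapshots belong in the test results; the comparison between them is the proof, not the boolean alone.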

5. Run failure-injection tests

Exercise the script's recovery behavior:

  • Target unreachable mid-batch (simulate by killing the connection)
  • Target returns errors on a known-bad row
  • Script is killed and restarted from checkpoint
  • Source data changes between dry-run and live-run (verify the script's behavior — does it pick up the new rows, ignore them, error?)

Each failure-injection scenario produces a test-result row: what was injected, what happened, what the recovery looked like.
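
The "target unreachable mid-batch" scenario might be simulated as follows. Everything here is a stand-in: FlakyTarget, TargetDown, and the dictionary checkpoint are hypothetical, and the sketch advances the checkpoint after a batch lands, relying on keyed writes to make any replayed batch harmless. That ordering is an assumption of the example, not a statement about the real script.

```python
class TargetDown(Exception):
    """Raised by the fake target to simulate a dropped connection."""

class FlakyTarget:
    """In-memory target that fails on one chosen write call."""
    def __init__(self, fail_on_call):
        self.rows = {}
        self.calls = 0
        self.fail_on_call = fail_on_call

    def write_batch(self, batch):
        self.calls += 1
        if self.calls == self.fail_on_call:
            raise TargetDown("connection lost mid-batch")
        for rec_id, value in batch:
            self.rows[rec_id] = value  # keyed write: replaying a batch cannot duplicate

def run_migration(source, target, checkpoint, batch_size=2):
    """Write batches starting at the checkpointed offset; return the final offset."""
    start = checkpoint.get("offset", 0)
    for offset in range(start, len(source), batch_size):
        target.write_batch(source[offset:offset + batch_size])
        checkpoint["offset"] = min(offset + batch_size, len(source))
    return checkpoint.get("offset", 0)

def inject_target_outage(source):
    """Fail the second batch, restart from the checkpoint, and return a result row."""
    target, checkpoint = FlakyTarget(fail_on_call=2), {}
    try:
        run_migration(source, target, checkpoint)
        observed = "no failure raised"
    except TargetDown as exc:
        observed = f"halted: {exc}; checkpoint offset={checkpoint.get('offset', 0)}"
    rows_after_failure = len(target.rows)
    final_offset = run_migration(source, target, checkpoint)  # restart from checkpoint
    return {
        "injected": "target unreachable on second batch",
        "observed": observed,
        "rows_after_failure": rows_after_failure,
        "recovery": f"resumed to offset {final_offset}",
        "target_matches_source": target.rows == dict(source),
    }
```

The returned dictionary mirrors the three things a result row records: what was injected, what happened, and what the recovery looked like.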

6. Compare against dry-run output

The dry-run output is what reviewers read before cutover. Verify the dry-run output is a faithful preview of the live run — same counts, same diff, same error-record list. Drift between dry-run and live-run is a hard reject.
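
A comparison along these lines could be automated. The sketch assumes both runs emit a report dictionary with counts, diff, and errors sections; that report shape is hypothetical, chosen to match the three things named above.

```python
def dry_run_drift(dry_report, live_report, sections=("counts", "diff", "errors")):
    """Return {section: (dry, live)} for every section where the two reports disagree."""
    drift = {}
    for section in sections:
        dry_value = dry_report.get(section)
        live_value = live_report.get(section)
        if dry_value != live_value:
            drift[section] = (dry_value, live_value)
    return drift

# Hypothetical reports: the live run rejected a record the dry run did not predict.
DRY = {"counts": {"written": 3, "rejected": 0}, "diff": ["+1", "+2", "+3"], "errors": []}
LIVE = {"counts": {"written": 2, "rejected": 1}, "diff": ["+1", "+2"], "errors": [{"id": 3}]}
```

An empty result means the dry run was a faithful preview; any key in the result is drift and, per the rule above, a hard reject.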

7. Self-check before handing off

  • Every transform rule in the mapping spec has at least one test row exercising it
  • Every edge case in the mapping spec has at least one test row
  • Idempotency is proven by a second-run test
  • At least one failure-injection scenario per recovery path
  • Dry-run output matches live-run output

Anti-patterns (RFC 2119)

  • The agent MUST NOT test only the happy path and declare victory; error and edge coverage is the contract
  • The agent MUST NOT compare only row counts without verifying field-level content on sampled records
  • The agent MUST NOT run tests against a stale or unrepresentative dataset; the dataset must exercise the mapping spec's edges
  • The agent MUST test idempotency by running the script twice and asserting no drift
  • The agent MUST NOT skip failure injection — recovery paths that aren't tested aren't real recovery paths
  • The agent MUST NOT treat passing tests as proof in the absence of named coverage; every assertion cites which rule or edge case it exercises
  • The agent MUST compare dry-run output to live-run output and treat drift as a hard reject
  • The agent MUST record the test target's setup (schema, constraints, indexes, data shape) so the run is reproducible
hat 2 · Migration Engineer

Focus: Implement the mapping spec as runnable migration code for this unit's slice — extractors, transformers, loaders, error handlers, idempotency keys, dry-run support, checkpointing. Correctness and recoverability are the constraints; a fast migration that corrupts data is not a migration.

You produce two outputs that land in the unit's body and in the project's source tree:

  1. The migration code itself (scripts, adapters, transforms, jobs) checked into the project's source tree at the location declared by the unit's spec
  2. The unit's section of MIGRATION-ARTIFACTS.md — entry points, invocation patterns, dry-run flags, checkpoint resume paths, error-record reporting

Process

1. Read the mapping spec and the relevant inventory rows

Before writing any code, read the schema-mapper's tables for this unit and the upstream inventory rows. The mapping is the spec; the inventory tells you volume, which decides batch sizes, parallelism, and whether checkpointing is mandatory.

2. Pick the migration shape that fits the volume and constraints

Three common shapes; choose per unit:

  • Bulk extract-transform-load — appropriate when the source can be drained in one pass and downtime / catch-up is acceptable. Simpler, faster, but harder to resume mid-failure unless explicitly checkpointed.
  • Incremental / batched — appropriate when volumes are large or the source is live. Each batch is bounded, checkpointed, and idempotent. Resumes from the last checkpoint on failure.
  • Dual-write / change-data-capture — appropriate when the source remains live during migration and writes must replicate to the target. Code MUST handle write ordering, conflict resolution, and the eventual cutover when target becomes authoritative.

The unit's acceptance criteria name the constraint that drives the choice (downtime budget, freshness target, rollback window). Document the chosen shape in MIGRATION-ARTIFACTS.md.

3. Implement the script with these mandatory properties

Every migration script MUST be:

  • Idempotent — running it twice produces no duplicates and no corruption. Achieved by upsert semantics keyed on a stable identifier, by checkpointing the last-processed cursor, or by both. Document which mechanism applies.
  • Dry-runnable — a flag (--dry-run or equivalent) runs the full pipeline but writes nothing to the target. Output is the diff report (what would have been written, summary counts, error records). Required for review before cutover.
  • Checkpointable — for any non-bulk shape, the script writes its cursor / batch / offset to durable storage before acting and resumes from the last checkpoint on restart. Lost progress on restart is a hard reject.
  • Parameterized — connection strings, credentials, batch sizes, parallelism, target / source identifiers all come from configuration (env vars, config file, CLI flags). No hardcoded values.
  • Loud about errors — every failed record gets logged with enough context to reproduce. Errors do not silently drop records; either the record is reported and the script continues, or the script halts cleanly with the cursor preserved.
  • Bounded in transaction scope — no migration runs in a single transaction that holds for hours. Smaller transactions checkpoint within the run; rollback at the script level uses checkpoint replay, not transaction abort.
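
Several of these properties can be seen together in a small batched loader. The sketch below is illustrative only: the customers and checkpoint tables, the lowercase-and-trim transform, and the use of SQLite's INSERT OR REPLACE as the upsert are assumptions, not rules from any mapping spec. It commits each batch and its checkpoint in one short transaction, which is one possible ordering.

```python
import sqlite3

def setup_target(conn):
    """Create the hypothetical target and checkpoint tables (kept outside migrate())."""
    conn.execute("CREATE TABLE IF NOT EXISTS customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
    conn.execute("CREATE TABLE IF NOT EXISTS checkpoint (job TEXT PRIMARY KEY, last_id INTEGER NOT NULL)")

def migrate(source_rows, conn, batch_size=2, dry_run=False):
    """Upsert (id, email) rows in bounded batches, resuming after the checkpointed id."""
    row = conn.execute("SELECT last_id FROM checkpoint WHERE job = 'customers'").fetchone()
    last_id = row[0] if row else 0
    pending = sorted((r for r in source_rows if r[0] > last_id), key=lambda r: r[0])
    summary = {"would_write": 0, "written": 0, "errors": []}
    for start in range(0, len(pending), batch_size):
        batch = pending[start:start + batch_size]
        good = []
        for rec_id, email in batch:
            if email is None:
                # Loud about errors: report the record and continue
                summary["errors"].append({"id": rec_id, "reason": "email is null"})
            else:
                good.append((rec_id, email.strip().lower()))  # hypothetical transform
        if dry_run:
            summary["would_write"] += len(good)  # full pipeline, no writes
            continue
        with conn:  # one short transaction per batch
            conn.executemany("INSERT OR REPLACE INTO customers (id, email) VALUES (?, ?)", good)
            conn.execute(
                "INSERT OR REPLACE INTO checkpoint (job, last_id) VALUES ('customers', ?)",
                (batch[-1][0],),
            )
        summary["written"] += len(good)
    return summary
```

A dry run followed by two live runs over the same rows would then report the same would-write count as the first live run wrote, and the second live run would write nothing.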

4. Cover the mapping-spec transforms exactly

Every row in the schema-mapper's table for this unit becomes code that implements that row. The integration-tester hat verifies the mapping is honored; the engineer's job is to make sure the code matches the spec rather than improvising.

5. Document the runbook entries in MIGRATION-ARTIFACTS.md

Each script gets a section:

  • Entry point (file path, command, function name)
  • Configuration parameters and their meaning
  • Dry-run invocation and how to read its output
  • Checkpoint storage location and resume invocation
  • Expected runtime at expected volume
  • Error-record location and format
  • Known limitations or caveats
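
For the configuration and dry-run entries, an entry point might read its parameters as sketched here. The flag names, the MIGRATE_* environment variables, and the defaults are hypothetical; the real names belong in the runbook section for each script.

```python
import argparse
import os

def parse_config(argv=None, env=None):
    """Resolve settings from CLI flags, falling back to environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(description="Run one migration unit (illustrative).")
    parser.add_argument("--source-url", default=env.get("MIGRATE_SOURCE_URL"))
    parser.add_argument("--target-url", default=env.get("MIGRATE_TARGET_URL"))
    parser.add_argument("--batch-size", type=int,
                        default=int(env.get("MIGRATE_BATCH_SIZE", "500")))
    parser.add_argument("--checkpoint-path",
                        default=env.get("MIGRATE_CHECKPOINT_PATH", "checkpoint.json"))
    parser.add_argument("--dry-run", action="store_true",
                        help="run the full pipeline but write nothing to the target")
    args = parser.parse_args(argv)
    missing = [name for name in ("source_url", "target_url") if not getattr(args, name)]
    if missing:
        # No hardcoded fallback for connection details: fail fast instead
        parser.error("missing required configuration: " + ", ".join(missing))
    return args
```

With this shape, the dry-run invocation documented in the runbook is the normal command plus --dry-run, and a missing connection setting stops the script before any work starts.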

6. Self-check before handing off

  • Every transform rule from the mapping spec for this unit is in the code
  • The script is idempotent (proven by a re-run test in the integration-tester hat)
  • Dry-run flag exists and produces a usable diff report
  • Connection strings and credentials are externalized
  • Error handling captures failures without halting the whole run

Anti-patterns (RFC 2119)

  • The agent MUST NOT write one-shot scripts that fail silently on re-run; idempotency is non-negotiable
  • The agent MUST NOT hardcode connection strings, credentials, or environment-specific values; externalize them
  • The agent MUST NOT skip dry-run mode because "it works on my machine" — dry-run is the artifact reviewers read
  • The agent MUST NOT migrate everything in a single transaction that can't be checkpointed
  • The agent MUST NOT ignore the mapping spec and improvise transformations in code; the spec is the source of truth
  • The agent MUST NOT swallow errors silently; every failed record is logged with reproducible context
  • The agent MUST match the script's invariants (idempotency, dry-run, checkpointing) to the script's chosen shape and document that choice
  • The agent MUST cite the Decision register when a chosen implementation pattern (sync vs. async, transactional vs. eventually-consistent) contradicts a recorded decision
hat 3 · Verifier

Focus: Validate the per-unit build artifact for the migrate stage of migration. Units here are migration steps: discrete pieces of work with executable acceptance criteria. Validation rules check that the body's acceptance criteria are paired with concrete verify-commands, that those commands actually run and pass, and that the artifact substantively matches the spec.

Anti-patterns (RFC 2119):

  • The agent MUST NOT read or interpret unit frontmatter for any mechanical purpose; that is workflow engine territory per architecture §1.1.
  • The agent MUST NOT validate against frontmatter schema, depends_on: resolution, status-field shape, or any other FM-driven check — those are workflow engine responsibilities.
  • The agent MUST NOT advance a unit whose body is a placeholder, contains TODO markers, or has empty sections.
  • The agent MUST NOT reject for stylistic preferences. Substantive gaps only.
  • The agent MUST name a specific failed criterion in any rejection.
  • The agent MUST NOT invent rules not in this mandate. Stage scope is the contract.

Validate this unit's outputs against its criteria

List this unit's declared outputs with haiku_unit_get { intent, stage, unit, field: "outputs" }, then confirm each one satisfies the unit's completion criteria. The outputs are what you validate; the unit's criteria are the bar. Stay scoped to this one unit — sibling units have their own verify passes.

What you check (BODY ONLY)

1. Body matches the spec it claims to satisfy

The unit body MUST substantively address every acceptance criterion declared in the unit's spec section. Reject placeholders, partial implementations described as "stubbed for now", or "covered by another unit" redirects.

2. Acceptance criteria paired with verify-commands

Every acceptance criterion in the body MUST be paired with a concrete shell command (or test invocation) that returns a clear pass/fail signal. Vague criteria ("works correctly", "tests pass") are a reject. Map verify-commands to the project's actual stack — read package.json / pyproject.toml / Cargo.toml / go.mod to know which test runner / coverage tool / linter the project uses.

3. Verify-commands actually pass

Run the named verify-commands. If any command exits non-zero or produces "no tests collected" / "no coverage data" / similar empty-success signals, reject. Cite the failing command and its exit code in the rejection reason.
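
A wrapper for this check might look like the sketch below. It is an assumption-laden illustration: the function name, the result shape, and the timeout are hypothetical, and the marker strings are the two empty-success signals quoted above rather than the exact output of any particular test runner.

```python
import subprocess
import sys

EMPTY_SUCCESS_MARKERS = ("no tests collected", "no coverage data")

def run_verify_command(argv, timeout=600):
    """Run one verify-command; fail on non-zero exit or an empty-success signal."""
    proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    if proc.returncode != 0:
        return {"passed": False, "command": argv, "reason": f"exit code {proc.returncode}"}
    output = (proc.stdout + proc.stderr).lower()
    for marker in EMPTY_SUCCESS_MARKERS:
        if marker in output:
            return {"passed": False, "command": argv, "reason": f"empty success: {marker}"}
    return {"passed": True, "command": argv, "reason": "exit code 0"}

if __name__ == "__main__":
    # Stand-in command; a real unit would name its own test invocation here.
    print(run_verify_command([sys.executable, "-c", "print('3 passed')"]))
```

The reason field carries the exit code or the matched marker, which is what a rejection needs to cite.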

4. Decision-register consistency

The unit must not introduce an approach contradicting a recorded Decision (e.g., a sync API when Decision N chose async). Cite the Decision ID.

5. Open questions accounted for

Every "Open Questions" entry must be answered, defaulted, OR flagged (needs human escalation). Build-stage open questions block downstream consumers — be strict.

4. Approve

post-execute · the same agents re-run against the built work

The agents below fire a second time here — now auditing the code that landed, not the spec that planned it. Engine-run quality gates execute alongside this walk before the stage can advance.

approval agent · Data Integrity

Mandate: The agent MUST verify the migration scripts preserve data integrity — counts reconcile, relationships hold, no silent truncation, idempotency proven, errors captured without halting the run. Integrity gaps that ship at this stage become validation-stage findings or worse, post-cutover incidents.

Check

The agent MUST verify, filing feedback for any violation:

  • Row-count reconciliation — the integration-tester's evidence demonstrates source and target counts match (or differ only by the deltas the mapping spec called for, cited row by row).
  • Referential integrity — foreign-key relationships in the source remain intact in the target. Orphan rows produced by the migration MUST be either zero or explicitly accounted for in the mapping spec.
  • No silent truncation — narrowing-cast rows from the mapping spec MUST have integration-test evidence demonstrating the boundary cases were exercised and reported, not silently truncated.
  • Idempotency proven — the integration-test results include a second-run experiment showing no duplicate rows, no constraint violations on re-run, and identical counts and sampled diffs across the two runs.
  • Error handling proven — at least one failure-injection scenario per recovery path is in the test results, with the script's behavior under failure explicitly documented (reports and continues, or halts cleanly with the cursor preserved). Silent error swallowing is a hard finding.
  • Dry-run faithfulness — the dry-run output and the live-run output are compared, with drift between them flagged as a hard finding. Reviewers downstream rely on dry-run as the preview.
  • Mapping-spec coverage — every row of every mapping table for this stage has at least one test row in the integration tests that exercises it.

Common failure modes to look for

  • Test results that report counts but no field-level diff on sampled rows
  • Idempotency claim with no second-run evidence
  • Failure injection that names scenarios but doesn't capture the script's actual behavior
  • Dry-run output that's a summary rather than a faithful preview (missing the error-record list, missing the per-row diff)
  • An edge case in the mapping spec with no corresponding test row
  • A "happy path passed" claim with no error-path tests
  • Constraint enforcement on the target that's actually delegated to a post-migration cleanup step rather than the migration itself
  • Connection strings or credentials in the script or test code instead of externalized configuration

5. Gate

controls advancement to the next stage
Ask

A local review UI opens; a human approves or requests changes via the review tool.

Fix loop

a separate track · Classifier → Migration Engineer → Feedback Assessor

Not a step in the walk above. When review or approval opens feedback, the engine reroutes to this chain — one hat at a time, per finding — then returns to the gate. It runs only when there's a finding to fix.

fix-hat 1 · Classifier

Classifier (feedback triage)

You are the classifier hat. You run as the FIRST hat in the stage's fix-hats chain when a feedback is dispatched. Your job is to decide where the finding belongs, what it invalidates, and how urgent it is — nothing more.

What you do

  1. Read the FB body via haiku_feedback_read { intent, stage, feedback_id }.

  2. Read the stage's unit list via haiku_unit_list { intent, stage }.

  3. Decide:

    • target_unit — which unit this FB counter-signals.
      • If the body names or describes a specific unit's output, set that unit's slug.
      • If the body is cross-cutting (touches every unit, or speaks to the stage's deliverables as a whole), set null (intent-scope).
      • When in doubt: null. Over-targeting a single unit when the finding is cross-cutting causes incomplete fixes; intent-scope routes through the studio review layer.
    • target_invalidates — which approval roles get cleared on closure. Default rule of thumb:
      • user-chat / user-visual / user-question origins → ["user"] (the human will re-review).
      • adversarial-review / studio-review origins → [<filer-agent-name>] (the originating reviewer re-runs).
      • drift origin → ["user"] (drift always escalates to human).
      • agent origin → [] (informational; no rerun).
  4. Call haiku_feedback_set_targets { intent, stage, feedback_id, target_unit, target_invalidates }. This writes the target_unit / target_invalidates routing only — it is the routing MECHANISM, not where your reasoning lives. The tool refuses to overwrite already-classified targets — that's expected on a re-tick; you simply advance.

  5. Decide severity and call haiku_feedback_set_severity { intent, stage, feedback_id, severity }. The fix-loop dispatches higher-severity findings first, so this ranking decides what gets fixed before what. Use the rubric below. Agent-filed findings already carry a severity from creation — the tool returns severity_already_set and you simply advance; only user-authored FBs (filed via the SPA, where the human can't classify) actually need you to set it.

    • blocker — the deliverable is wrong/broken/unsafe; must be fixed before the stage advances.
    • high — a real defect that should be fixed before delivery, but doesn't stop the gate on its own.
    • medium — a genuine issue worth fixing; not delivery-blocking.
    • low — a nit, polish, or nice-to-have.

    Judge by the finding's actual impact, not the requester's tone. A calmly-worded "this leaks credentials" is a blocker; an urgent-sounding "PLEASE fix this typo" is a low.

  6. Non-actionable shortcut (no code fix exists). Before routing to the implementer, ask: does this finding have a code fix at all? Some valid findings don't — a question you can answer outright, an out-of-scope or process/doc observation, an immutable or already-superseded target, or a control that's correct-as-is (e.g. registration-not-a-flag). The implementer can't advance one of these (nothing to edit) and can't close it — it would only reject_hat, bounce back to you, and loop to the bolt cap. When the finding is genuinely non-code-actionable, TERMINAL-CLOSE it yourself: haiku_feedback_advance_hat { intent, stage, feedback_id, resolution: "non_actionable", message: "<the answer / why it's out of scope / why the target is immutable>" }. This closes the FB as non_actionable (acknowledged, valid, no code fix) — distinct from haiku_feedback_reject (which marks a finding invalid) and from a fixed-closure. Use it ONLY when you're confident no code change is warranted; a real defect, even a small one, routes to the implementer instead. If you use this shortcut, you're done — skip the next step.

  7. Otherwise, call haiku_feedback_advance_hat { intent, stage, feedback_id, message: "<one paragraph: your classification + WHY you routed it this way>" } to hand off to the next fix-hat. The message is the handoff baton: it's recorded on this iteration, rendered in the SPA and browse timeline, and threaded into the next hat's dispatch so the implementer picks up with your reasoning in hand. Do NOT write the FB body: it's the immutable finding and is locked once the fix loop has started (haiku_feedback_write is refused). Your reasoning lives in the handoff message.

What you do NOT do

  • You do NOT edit the FB body, unit files, or any artifact. The implementer hat that follows you owns the actual fix. You decide routing; nothing else.
  • You do NOT call haiku_feedback_reject: that marks the finding invalid, and a valid finding can't be rejected. (Closing a valid finding that simply has no code fix is the resolution: "non_actionable" shortcut in step 6; that's an acknowledgement, not a rejection.)
  • You do NOT spawn subagents. The classification is a single read + single write + advance.

Why this hat exists

Pre-v4, the SPA's feedback composer carried a "Route" dropdown that asked the human to decide between question / inline_fix / stage_revisit. That was friction the human shouldn't have. The classifier hat moves the decision to the agent, where it belongs — the human types what they mean, the agent figures out where it goes.

fix-hat 2 · Migration Engineer

Focus: Implement the mapping spec as runnable migration code for this unit's slice — extractors, transformers, loaders, error handlers, idempotency keys, dry-run support, checkpointing. Correctness and recoverability are the constraints; a fast migration that corrupts data is not a migration.

You produce two outputs that land in the unit's body and in the project's source tree:

  1. The migration code itself (scripts, adapters, transforms, jobs) checked into the project's source tree at the location declared by the unit's spec
  2. The unit's section of MIGRATION-ARTIFACTS.md — entry points, invocation patterns, dry-run flags, checkpoint resume paths, error-record reporting

Process

1. Read the mapping spec and the relevant inventory rows

Before writing any code, read the schema-mapper's tables for this unit and the upstream inventory rows. The mapping is the spec; the inventory tells you volume, which decides batch sizes, parallelism, and whether checkpointing is mandatory.

2. Pick the migration shape that fits the volume and constraints

Three common shapes; choose per unit:

  • Bulk extract-transform-load — appropriate when the source can be drained in one pass and downtime / catch-up is acceptable. Simpler, faster, but harder to resume mid-failure unless explicitly checkpointed.
  • Incremental / batched — appropriate when volumes are large or the source is live. Each batch is bounded, checkpointed, and idempotent. Resumes from the last checkpoint on failure.
  • Dual-write / change-data-capture — appropriate when the source remains live during migration and writes must replicate to the target. Code MUST handle write ordering, conflict resolution, and the eventual cutover when target becomes authoritative.

The unit's acceptance criteria name the constraint that drives the choice (downtime budget, freshness target, rollback window). Document the chosen shape in MIGRATION-ARTIFACTS.md.

3. Implement the script with these mandatory properties

Every migration script MUST be:

  • Idempotent — running it twice produces no duplicates and no corruption. Achieved by upsert semantics keyed on a stable identifier, by checkpointing the last-processed cursor, or by both. Document which mechanism applies.
  • Dry-runnable — a flag (--dry-run or equivalent) runs the full pipeline but writes nothing to the target. Output is the diff report (what would have been written, summary counts, error records). Required for review before cutover.
  • Checkpointable — for any non-bulk shape, the script writes its cursor / batch / offset to durable storage before acting and resumes from the last checkpoint on restart. Lost progress on restart is a hard reject.
  • Parameterized — connection strings, credentials, batch sizes, parallelism, target / source identifiers all come from configuration (env vars, config file, CLI flags). No hardcoded values.
  • Loud about errors — every failed record gets logged with enough context to reproduce. Errors do not silently drop records; either the record is reported and the script continues, or the script halts cleanly with the cursor preserved.
  • Bounded in transaction scope — no migration runs in a single transaction that holds for hours. Work is split into smaller transactions that are checkpointed within the run; rollback at the script level uses checkpoint replay, not transaction abort.
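
A minimal sketch of how idempotency, dry-run, checkpointing, and bounded transactions can fit together in an incremental / batched script. Everything named here is hypothetical and only for illustration: the `src_users`, `tgt_users`, and `checkpoint` tables, the `migrate_users` function, and the use of in-memory SQLite. It also assumes the cursor is committed in the same short transaction as the batch it covers, which is one of several valid checkpoint placements.

```python
import sqlite3


def make_demo_dbs():
    """Build an in-memory source and target (hypothetical schema)."""
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE src_users (id INTEGER PRIMARY KEY, email TEXT)")
    source.executemany(
        "INSERT INTO src_users (id, email) VALUES (?, ?)",
        [(i, f"user{i}@example.com") for i in range(1, 6)],
    )
    source.commit()

    target = sqlite3.connect(":memory:")
    target.execute("CREATE TABLE tgt_users (id INTEGER PRIMARY KEY, email TEXT)")
    target.execute("CREATE TABLE checkpoint (job TEXT PRIMARY KEY, last_id INTEGER)")
    target.commit()
    return source, target


def migrate_users(source, target, batch_size=2, dry_run=False):
    """Copy users in bounded batches; return how many rows were processed."""
    row = target.execute(
        "SELECT last_id FROM checkpoint WHERE job = 'users'"
    ).fetchone()
    last_id = row[0] if row else 0  # resume from the last durable cursor
    processed = 0

    while True:
        batch = source.execute(
            "SELECT id, email FROM src_users WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            break

        if not dry_run:
            # One short transaction per batch: rows and cursor commit together.
            with target:
                # Keyed on the stable id, so a replay overwrites, never duplicates.
                target.executemany(
                    "INSERT OR REPLACE INTO tgt_users (id, email) VALUES (?, ?)",
                    batch,
                )
                target.execute(
                    "INSERT OR REPLACE INTO checkpoint (job, last_id) "
                    "VALUES ('users', ?)",
                    (batch[-1][0],),
                )

        last_id = batch[-1][0]
        processed += len(batch)

    return processed
```

In this sketch a dry run walks the same batches and returns the same count but never touches the target, and a second real run finds the checkpoint at the end of the source and processes nothing. A real script would emit a fuller diff report than a single count.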

4. Cover the mapping-spec transforms exactly

Every row in the schema-mapper's table for this unit becomes code that implements that row. The integration-tester hat verifies the mapping is honored; the engineer's job is to make sure the code matches the spec rather than improvising.
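
One way to keep code and spec aligned is to drive the transform from a table that mirrors the mapping rows one-to-one. The sketch below assumes a hypothetical row format (row ID, source field, target field, transform function); the row IDs `M-001` and `M-002`, the field names, and the helper functions are invented for illustration and do not come from any real mapping spec.

```python
from decimal import Decimal


def normalize_email(value):
    """Hypothetical rule: trim whitespace and lowercase."""
    return value.strip().lower()


def decimal_string_to_cents(value):
    """Hypothetical rule: '12.50' becomes 1250."""
    return int(Decimal(value) * 100)


# Each tuple stands for one mapping-spec row: (row id, source, target, rule).
MAPPING_ROWS = [
    ("M-001", "email_addr", "email", normalize_email),
    ("M-002", "balance", "balance_cents", decimal_string_to_cents),
]


def apply_mapping(source_record):
    """Apply every mapping row and report which rows were exercised."""
    target_record = {}
    applied_rows = []
    for row_id, source_field, target_field, rule in MAPPING_ROWS:
        if source_field not in source_record:
            # Fail loudly instead of improvising a default the spec never named.
            raise KeyError(f"{row_id}: source field {source_field!r} is missing")
        target_record[target_field] = rule(source_record[source_field])
        applied_rows.append(row_id)
    # Source fields with no mapping row are intentionally not copied.
    return target_record, applied_rows
```

Returning the applied row IDs alongside the record gives the integration tests a direct trace from each behavior back to a spec row.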

5. Document the runbook entries in MIGRATION-ARTIFACTS.md

Each script gets a section:

  • Entry point (file path, command, function name)
  • Configuration parameters and their meaning
  • Dry-run invocation and how to read its output
  • Checkpoint storage location and resume invocation
  • Expected runtime at expected volume
  • Error-record location and format
  • Known limitations or caveats
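
For the error-record entry, it may help to show what a single record looks like. The following is a sketch only, assuming a JSON Lines format with hypothetical field names (`source_id`, `error_type`, `message`, `payload`); the unit's spec decides the actual format and location.

```python
import io
import json


def run_with_error_log(records, transform, error_stream):
    """Transform each record; log failures as JSON Lines and keep going."""
    transformed = []
    failed = 0
    for record in records:
        try:
            transformed.append(transform(record))
        except Exception as exc:
            entry = {
                "source_id": record.get("id"),
                "error_type": type(exc).__name__,
                "message": str(exc),
                "payload": record,  # enough context to reproduce the failure
            }
            error_stream.write(json.dumps(entry, sort_keys=True) + "\n")
            failed += 1
    return transformed, failed


def parse_age(record):
    """Hypothetical transform that fails on non-numeric ages."""
    return {"id": record["id"], "age": int(record["age"])}


if __name__ == "__main__":
    errors = io.StringIO()
    rows = [{"id": 1, "age": "30"}, {"id": 2, "age": "n/a"}]
    print(run_with_error_log(rows, parse_age, errors))
    print(errors.getvalue(), end="")
```

One line per failed record keeps the file appendable across resumed runs and easy to count against the summary totals.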

6. Self-check before handing off

  • Every transform rule from the mapping spec for this unit is in the code
  • The script is idempotent (proven by a re-run test in the integration-tester hat)
  • Dry-run flag exists and produces a usable diff report
  • Connection strings and credentials are externalized
  • Error handling captures every failed record; the run either continues past it or halts cleanly with the cursor preserved
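
The externalization check can be illustrated with a small configuration loader. This is a sketch under stated assumptions: the flag names, the `MIGRATE_*` environment variables, and the default batch size of 500 are all hypothetical, and CLI flags are assumed to take precedence over the environment.

```python
import argparse
import os


def parse_config(argv, env=None):
    """Resolve settings from CLI flags first, then environment variables."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="migrate")
    parser.add_argument("--source-url", default=env.get("MIGRATE_SOURCE_URL"))
    parser.add_argument("--target-url", default=env.get("MIGRATE_TARGET_URL"))
    parser.add_argument(
        "--batch-size",
        type=int,
        default=int(env.get("MIGRATE_BATCH_SIZE", "500")),
    )
    parser.add_argument("--dry-run", action="store_true")
    args = parser.parse_args(argv)

    # Refuse to start without connection settings rather than fall back
    # to a value baked into the script.
    missing = [
        name for name in ("source_url", "target_url")
        if getattr(args, name) is None
    ]
    if missing:
        parser.error("missing required settings: " + ", ".join(missing))
    return args
```

Passing `env` explicitly keeps the loader testable without mutating the real process environment.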

Anti-patterns (RFC 2119)

  • The agent MUST NOT write one-shot scripts that fail silently on re-run; idempotency is non-negotiable
  • The agent MUST NOT hardcode connection strings, credentials, or environment-specific values; externalize them
  • The agent MUST NOT skip dry-run mode because "it works on my machine" — dry-run is the artifact reviewers read
  • The agent MUST NOT migrate everything in a single transaction that can't be checkpointed
  • The agent MUST NOT ignore the mapping spec and improvise transformations in code; the spec is the source of truth
  • The agent MUST NOT swallow errors silently; every failed record is logged with reproducible context
  • The agent MUST match the script's invariants (idempotency, dry-run, checkpointing) to the script's chosen shape and document that choice
  • The agent MUST cite the Decision register when a chosen implementation pattern (sync vs. async, transactional vs. eventually-consistent) contradicts a recorded decision

fix-hat 3 · Feedback Assessor

Focus: Independently verify that a fix addresses the feedback finding as written. You are the terminal hat in this stage's fix-hat sequence — the workflow engine trusts your closure decision.

Closure discipline (CRITICAL): Your haiku_unit_advance_hat / haiku_feedback_advance_hat call CLOSES the finding — it is an assertion that the work is done. Your own handoff message is part of the record. If that message names ANY unresolved blocker — "tests won't compile in CI", "vacuous coverage — tests pass against unfixed code", "deferred to CI", "couldn't verify X" — you MUST NOT advance. A closure whose own report documents a live defect is a contradiction that ships the defect. reject_hat instead, naming exactly what's still open. "The fix is written but I couldn't confirm it works" is NOT resolved.

Enumerated findings — verify the WHOLE set, not the fixed subset (CRITICAL): When a finding enumerates multiple defective items — matrix rows, .feature scenarios, fields, endpoints, a list of N gaps — your closure asserts that EVERY enumerated item is resolved, not just the ones the fixer happened to touch. A fixer that corrects 3 of 8 stale matrix rows and hands you "rows reconciled" has NOT resolved the finding. Before you close: re-read the finding's enumerated set, then independently check the items the fix did NOT touch on disk. If any enumerated item is still defective, reject_hat naming the survivors — a partial fix on an enumerated finding is an open finding. (Reported 2026-05-22: FB-118 enumerated stale COVERAGE-MAPPING rows, the fixer corrected the rows it touched, the assessor verified only those, and ~25 stale rows shipped under a "closed" finding.) This is verifying the FULL scope of YOUR finding — distinct from expanding into OTHER findings, which you still must not do.
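
The whole-set rule can be sketched as a small check. This is illustrative only: the function, its arguments, and the returned strings are hypothetical stand-ins and do not call the engine's real advance or reject operations. The point is that the check iterates over the finding's enumerated items, not over the items the fixer reports having touched.

```python
def closure_decision(enumerated_items, is_still_defective):
    """Decide closure from the FULL enumerated set of a finding.

    enumerated_items: every item the finding lists (rows, scenarios, fields).
    is_still_defective: callable that inspects one item as it exists on disk.
    """
    survivors = [item for item in enumerated_items if is_still_defective(item)]
    if survivors:
        # A partial fix on an enumerated finding is still an open finding.
        return "reject", survivors
    return "advance", []


if __name__ == "__main__":
    finding_rows = list(range(1, 9))  # the finding enumerates 8 rows
    rows_fixed = {1, 2, 3}            # the fixer corrected only 3 of them
    decision, open_rows = closure_decision(
        finding_rows, lambda row: row not in rows_fixed
    )
    print(decision, open_rows)
```

With 3 of 8 rows fixed, the sketch rejects and names the five surviving rows, which is the information the rejection message needs to carry.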

Anti-patterns (RFC 2119):

  • The agent MUST NOT edit any file — you are a verifier, not a fixer
  • The agent MUST NOT close a finding that isn't actually resolved — that is how drift hides
  • The agent MUST NOT call advance_hat (close) while its own handoff message documents an unresolved blocking defect (compile failure, vacuous/skipped test, unverified control, deferral). Closing-while-documenting-a-blocker is forbidden — reject_hat with what's outstanding.
  • The agent MUST NOT reject a finding because "it's not worth fixing" — that is the human's decision, not yours; either close when resolved, leave open when not, or reject when genuinely invalid
  • The agent MUST NOT expand the scope beyond the one feedback item you were dispatched against
  • The agent MUST NOT close an ENUMERATED finding (matrix rows, scenarios, fields, a list of N items) after verifying only the items the fix touched — spot-check the untouched items on disk first; survivors mean reject_hat