The Scavenger Hunt
By Jason Waldrip
Back in January we shipped AI-DLC — our first real attempt at making AI agents do actual engineering work, not just write code snippets. It was a methodology built on skills: big, structured prompts you'd load into the agent at the start of a task, describing the whole workflow in one pass. The agent would read the skill, internalize it, and run. Researcher hat, planner hat, builder hat, reviewer hat. Units with clear success criteria. Quality gates. It was an honest attempt to give a language model the scaffolding it needed to do sustained, multi-step engineering instead of one-shot tasks.
For a while, it worked. Then it didn't. Context compacted. The agent drifted. Six units in, it would forget what the skill told it in paragraph two. Someone hit /clear and the entire plan evaporated. We saw the same failure mode over and over: the agent couldn't hold the whole plan in its head, and the moment it tried, we lost the plan.
H·AI·K·U is what AI-DLC became when we stopped trying to fix that and rewrote the whole thing. Same goal — structured, multi-agent engineering work with quality gates, review cycles, and human checkpoints. Completely different mechanism. Instead of handing the agent a map, H·AI·K·U makes it run a scavenger hunt: ask for the next clue, execute it, come back, ask for the next clue. The agent never sees the whole map. It doesn't need to.
This post is about what falls out of that flip — the architecture, the state machine, the hooks, and the two places where the user is still in the loop.
If you'd rather click than read, every diagram below has a live counterpart. The runtime architecture map renders the whole system — actors, hooks, workflow phases, every state write — and every node opens its real definition. Each studio gets its own view, from software to legal to hwdev. The post is the story; the map is the ground truth.
The download vs. the scavenger hunt
The easiest way to see the flip is to put the two side by side.
AI-DLC front-loaded everything. One big prompt, delivered once, and then the agent was on its own. H·AI·K·U inverts that. The agent sees one clue at a time — start_unit, gate_review, fix_quality_gates — executes it, and comes back for the next one. The map only exists in the orchestrator and the workflow files on disk.
That's why a H·AI·K·U run survives context compaction, /clear, crashes, and even a different agent picking it up mid-flight. The state isn't in the conversation. It's in the filesystem.
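A minimal sketch of that idea, in Python, assuming a single JSON file as the state store. This is not the H·AI·K·U source: the `state.json` file, the `PLAN` list, and `next_action` are hypothetical names for illustration, and only the three action names come from this post.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional

# Action names taken from the post; the ordering here is illustrative only.
PLAN = ["start_unit", "gate_review", "fix_quality_gates"]


def next_action(state_file: Path) -> Optional[str]:
    """Serve one clue per tick, keeping the cursor on disk rather than in memory."""
    # Rehydrate from the filesystem on every call; nothing is cached in the process.
    state = json.loads(state_file.read_text()) if state_file.exists() else {"cursor": 0}
    if state["cursor"] >= len(PLAN):
        return None  # nothing left to hand out
    action = PLAN[state["cursor"]]
    state["cursor"] += 1
    state_file.write_text(json.dumps(state))
    return action


with tempfile.TemporaryDirectory() as tmp:
    state_file = Path(tmp) / "state.json"
    first = next_action(state_file)
    # Imagine /clear or a crash here: the conversation is gone, the file is not.
    second = next_action(state_file)
    print(first, second)  # prints "start_unit gate_review"
```

The sketch advances the cursor as soon as a clue is handed out. A real orchestrator would presumably verify the previous action's result before moving on.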
The two approaches diverge on the axis that does most of the work.
Where the plan lives
AI-DLC: front-loaded. The entire skill ships into context at the start of the run, and the agent's head is the state. Lose the context, lose the plan.
H·AI·K·U: drip-fed. One action per tick, served by the orchestrator. State lives on disk; the agent is stateless between ticks. Lose the context, and the next turn rehydrates from the filesystem.
That single shift drags a handful of other shifts with it. If the orchestrator owns the plan, it can also own control flow, enforcement, and parallelism — three things skills had to ask the agent nicely about.
- Control flow. The orchestrator picks the next move; the agent does it. No "agent decides what's next."
- Enforcement. Guardrails physically block violations instead of trusting the agent to follow the prompt.
- Parallelism. Units run as waves in isolated worktrees, not as a single agent on a single thread.
- Checkpoints. The user is forced into the loop at elaboration and review gates, and the workflow refuses to skip them.
- Failure mode. A misstep produces "no, do X next tick" rather than an agent that quietly drifts off the rails.
The full architecture
Five layers: the user, the Claude Code harness (agent + hooks), the H·AI·K·U MCP server, the workflow state on disk, and external systems — git worktrees, the review UI, quality gates.
The orchestrator is the brain. Hooks are spinal reflexes. The orchestrator decides what should happen next. Hooks make sure the agent physically can't do anything else between orchestrator calls. The agent is the body that executes the current action. And everything the orchestrator knows, it knows from reading files on disk. The static version is above; the interactive runtime map is the live ground truth — every actor, hook, and artifact is clickable and opens its real definition.
Stages and phases
Every studio is a named lifecycle template — a DAG of stages, each running its own phase workflow. H·AI·K·U ships with two dozen studios, covering everything from software to legal to incident response. The shape is the same across all of them; only the stage names and gate types differ. The default software studio runs six stages in strict order: inception → design → product → development → operations → security. Every stage walks the same phase sequence: pending → elaborate → execute → review → gate. The orchestrator enforces the transitions; the agent can only move forward when the preconditions are actually met, not when it thinks they are.
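The stage and phase ordering can be sketched as a small transition function. The stage and phase names below come from the software studio described above. The `advance` function and its `preconditions_met` flag are assumptions for illustration: the real engine checks actual preconditions, which this sketch collapses into a single boolean.

```python
from typing import Tuple

STAGES = ["inception", "design", "product", "development", "operations", "security"]
PHASES = ["pending", "elaborate", "execute", "review", "gate"]


def advance(stage: str, phase: str, preconditions_met: bool) -> Tuple[str, str]:
    """Return the next (stage, phase), or the same pair if the move is not allowed."""
    if not preconditions_met:
        return stage, phase  # the agent's opinion does not move the cursor
    p = PHASES.index(phase)
    if p + 1 < len(PHASES):
        return stage, PHASES[p + 1]  # next phase within the same stage
    s = STAGES.index(stage)
    if s + 1 < len(STAGES):
        return STAGES[s + 1], PHASES[0]  # past the gate: next stage starts at pending
    return stage, phase  # final gate of the final stage: nothing left to advance to
```

For example, `advance("inception", "gate", True)` yields `("design", "pending")`, while the same call with `preconditions_met=False` stays where it is.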
Two seams demand a human:
Elaboration conversation
Collaborative stages require at least three back-and-forth turns with the user before units can be finalized. Until the conversation hits the threshold, the orchestrator refuses to advance no matter what the agent does.
Review gates
When a stage is ready for review, the orchestrator blocks until the user clicks Approve, Request Changes, or Open PR in the web UI. The agent can't skip past it. There's no clever way around a tool call that just refuses to return.
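Both seams reduce to checks the orchestrator owns. Here is a rough sketch, assuming the three-turn threshold and the three review buttons described above. The function names and the decision identifiers are hypothetical, and returning `"blocked"` stands in for a tool call that simply does not return.

```python
from typing import Optional

MIN_TURNS = 3  # threshold stated in the post
# Hypothetical identifiers for the Approve / Request Changes / Open PR buttons.
DECISIONS = {"approve", "request_changes", "open_pr"}


def can_finalize_units(user_turns: int) -> bool:
    """Elaboration seam: units stay unfinalized until the user has taken enough turns."""
    return user_turns >= MIN_TURNS


def review_gate(decision: Optional[str]) -> str:
    """Review seam: no decision from the user means no progress."""
    if decision is None:
        return "blocked"
    if decision not in DECISIONS:
        raise ValueError(f"unknown review decision: {decision!r}")
    return decision
```

Neither function takes anything from the agent as input, which is the point: the agent has no lever to pull.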
Everything else is the agent in a tight loop: call haiku_run_next, do the work the action described, call haiku_run_next again. No meetings. No status updates. No "what should I do next?" Just clues.
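In code, the agent's side of the loop is about this small. The sketch assumes the orchestrator call returns either an action or nothing. `run_next` is a stand-in for calling `haiku_run_next`, whose real signature and payload are not shown in this post, and the dictionary shape is made up for the demo.

```python
from typing import Callable, Dict, List, Optional

Action = Dict[str, str]  # hypothetical payload shape


def run_agent(run_next: Callable[[], Optional[Action]],
              execute: Callable[[Action], None]) -> int:
    """Ask for a clue, do it, ask again, until the orchestrator has nothing left."""
    ticks = 0
    while True:
        action = run_next()  # stand-in for the haiku_run_next tool call
        if action is None:
            return ticks
        execute(action)
        ticks += 1


# Demo with an in-memory queue playing the orchestrator.
queue: List[Action] = [{"action": "start_unit"}, {"action": "gate_review"}]
done: List[str] = []
ticks = run_agent(lambda: queue.pop(0) if queue else None,
                  lambda a: done.append(a["action"]))
print(ticks, done)  # prints "2 ['start_unit', 'gate_review']"
```

The loop holds no plan of its own. Everything it knows about what comes next arrives in the return value of one call.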
The full state machine — six stages × five phases, every gate type, mode resolution, /haiku:revisit back-edges — is the engine. You don't have to take my word for it: the runtime architecture map lays out every phase, every action the agent sees, every validation check, and every state write per stage. Toggle between continuous / discrete / hybrid / autopilot modes in the top bar to see how the gates change shape.
Where hooks come in
Hooks run inside the agent harness — not inside the agent's reasoning. They exist because the agent can't be trusted to enforce its own rules. That's not an insult. It's a design decision. If you have to trust the agent to follow the rules, you'll find out it didn't the moment something goes wrong.
Block bad moves
Guardrails fire before any tool runs. They catch direct edits to workflow-managed files, reject malformed prompts, refuse to let the agent skip the bolt counter, and stop it from straying outside the active unit. When a guardrail blocks a call, the agent is told to use the right tool instead.
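One such guardrail, the check against direct edits to workflow-managed files, might look roughly like this. Everything specific here is an assumption: the `.haiku` directory name, the `Edit` and `Write` tool names, and the `file_path` key are placeholders, and the wiring into the harness's hook mechanism is left out.

```python
from pathlib import PurePosixPath
from typing import Dict, Tuple

MANAGED_DIR = ".haiku"           # hypothetical: the post does not name the directory
EDIT_TOOLS = {"Edit", "Write"}   # assumed names of file-modifying tools


def check_tool_call(tool_name: str, tool_input: Dict[str, str]) -> Tuple[bool, str]:
    """Decide, before the tool runs, whether the call is allowed."""
    if tool_name not in EDIT_TOOLS:
        return True, ""  # reads and other tools pass through this particular check
    path = PurePosixPath(tool_input.get("file_path", ""))
    if MANAGED_DIR in path.parts:
        # Block, and tell the agent what to do instead of just saying no.
        return False, f"{path} is workflow-managed; use the workflow tool instead of editing it directly."
    return True, ""
```

Returning a message alongside the verdict mirrors the behavior described above: a blocked agent gets redirected rather than left guessing.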
Keep the agent honest
Context injection fires at the boundary too. The workflow state, the active unit, the right hat, the budget — all of it lands in the agent's view automatically, so the agent never has to remember and never has to ask. Outputs the agent produces get tracked the moment they're written.
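As a sketch, context injection amounts to rendering the current state into text on every turn. The field names and the `build_context` helper below are assumptions, loosely following the items listed above; the real injected format is not shown in this post.

```python
from typing import Dict

# Assumed field names for the state the hook surfaces to the agent.
FIELDS = ["stage", "phase", "unit", "hat", "budget"]


def build_context(state: Dict[str, str]) -> str:
    """Render the known fields, in a fixed order, as lines to prepend to the agent's view."""
    return "\n".join(f"{name}: {state[name]}" for name in FIELDS if name in state)
```

Because the text is rebuilt from state each time, a compacted or cleared conversation gets the same view back on the next turn.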
Why this works
There's a story I keep coming back to. A senior engineer once told me, "the train can only move as fast as the tracks that it's built on." You can put any engine you want on top of bad rails. It doesn't matter. The rails are the ceiling. Agent frameworks spent years trying to put a bigger engine on the same bad rails — longer context windows, better prompts, more clever reasoning. And the rails kept failing in the same places: memory, state, recovery, enforcement.
H·AI·K·U is a bet on the rails. The engine doesn't need to be smarter. It needs to be asked the right question at the right time, and it needs to be stopped when it tries to do the wrong thing. The filesystem is the memory. The orchestrator is the brain. The hooks are the reflexes. The agent is the body. And the user, when needed, is called into exactly the right place: not because the agent remembered to ask, but because the workflow engine refuses to advance until the user has weighed in.
The agent never sees the whole map. It doesn't need to. It just needs the next clue.
Runtime architecture map
The full map: actors, hooks, workflow phases, MCP tools, knowledge pool, pre- and post-intent flows. Every chip clicks through to real source.