Mitigate
Apply immediate fixes to stop the bleeding: rollbacks, feature flags, scaling
Dependencies
Hat Sequence
Mitigator
Focus: Apply the fastest safe action to stop user-facing impact — rollback, feature flag, scaling, or hotfix. Speed matters, but so does not making things worse. Every action must be reversible.
Produces: Mitigation log documenting exactly what was done, when, and how to reverse it.
Reads: Root cause from investigation, deployment history, feature flag state, infrastructure configuration.
Anti-patterns (RFC 2119):
- The agent MUST NOT apply a fix without a rollback plan for the fix itself
- The agent MUST NOT choose a permanent fix when a faster temporary mitigation exists
- The agent MUST document the exact commands or config changes applied
- The agent MUST NOT make multiple changes simultaneously, making it impossible to attribute which one helped
- The agent MUST NOT skip communication — stakeholders need to know what's being done
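The Mitigator rules above suggest a log that refuses any action lacking its exact command or a way to undo it. A minimal sketch of such a log follows, assuming a hypothetical schema; the class names, field names, and the `flagctl`/`scalectl` commands in the usage note are illustrative placeholders, not part of any real tool.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class MitigationEntry:
    """One reversible action taken during mitigation (hypothetical schema)."""
    action: str           # e.g. "rollback", "feature_flag", "scaling", "hotfix"
    command: str          # exact command or config change applied
    reverse_command: str  # exact command that undoes this action
    applied_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


class MitigationLog:
    """Append-only log that rejects entries without a rollback plan."""

    def __init__(self) -> None:
        self.entries: list[MitigationEntry] = []

    def record(self, entry: MitigationEntry) -> None:
        # Enforce "document the exact commands" and "no fix without a
        # rollback plan for the fix itself" at write time.
        if not entry.command.strip():
            raise ValueError("exact command or config change is required")
        if not entry.reverse_command.strip():
            raise ValueError("no rollback plan for the mitigation itself")
        self.entries.append(entry)

    def reversal_plan(self) -> list[str]:
        # Undo actions in the reverse order they were applied.
        return [e.reverse_command for e in reversed(self.entries)]
```

Recording one entry per change, for example `log.record(MitigationEntry("feature_flag", "flagctl set new-checkout off", "flagctl set new-checkout on"))`, keeps each action attributable and gives `reversal_plan()` an ordered list of undo steps.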
Verifier
Focus: Confirm the mitigation actually stopped the user-facing impact. Use the same signals that detected the incident — if error rates triggered the alert, error rates should confirm the fix. Trust metrics, not assumptions.
Produces: Verification report confirming impact cessation with before/after metrics and any known side effects of the mitigation.
Reads: Mitigation log, monitoring dashboards, error tracking, the original alerting signals.
Anti-patterns (RFC 2119):
- The agent MUST NOT declare "fixed" based on a single data point or gut feeling
- The agent MUST NOT use different metrics to verify than the ones that detected the problem
- The agent MUST wait long enough for metrics to stabilize before confirming
- The agent MUST NOT ignore partial mitigation — impact reduced but not eliminated
- The agent MUST check for side effects introduced by the mitigation itself
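The Verifier rules above can be sketched as a small classifier over samples of the metric that triggered the alert. This is an illustrative sketch only: the function name, the `healthy_threshold` and `min_samples` parameters, and the four result labels are assumptions, and a real check would also need to cover side effects on other signals.

```python
from statistics import mean


def verify_mitigation(
    before: list[float],
    after: list[float],
    healthy_threshold: float,
    min_samples: int = 5,
) -> str:
    """Classify a mitigation from samples of the alerting metric.

    `before` and `after` must be samples of the same signal that detected
    the incident (for example, error rate per minute).
    """
    if not before:
        raise ValueError("need pre-mitigation samples for comparison")
    if len(after) < min_samples:
        # A single data point (or too few) cannot confirm a fix.
        return "inconclusive"
    if all(sample <= healthy_threshold for sample in after):
        return "mitigated"
    if mean(after) < mean(before):
        # Impact reduced but not eliminated: report it, do not ignore it.
        return "partial"
    return "not_mitigated"
```

Requiring `min_samples` readings after the change is one simple way to wait for metrics to stabilize; a time-based window would serve the same purpose.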
Review Agents
Safety
Mandate: The agent MUST verify the mitigation stops the bleeding without introducing new risks.
Check:
- The agent MUST verify that mitigation addresses the immediate impact, not a side effect
- The agent MUST verify that rollback or feature flag changes do not break other functionality
- The agent MUST verify that the mitigation is working in production, not just deployed
- The agent MUST verify that no data loss or corruption results from the mitigation action
Mitigate
Criteria Guidance
Good criteria examples:
- "Mitigation action is documented with exact commands or config changes applied"
- "Verification confirms user-facing impact has stopped, measured by the same metrics that triggered the incident"
- "Rollback plan exists in case the mitigation itself causes regression"
Bad criteria examples:
- "Issue is mitigated"
- "Fix is applied"
- "Things are back to normal"
Completion Signal (RFC 2119)
The mitigation log documents exactly what was done (rollback version, feature flag toggled, scaling action, or hotfix applied) with timestamps. The Verifier MUST have confirmed, using the same signals that detected the incident, that the user-facing impact has stopped. A rollback plan for the mitigation itself MUST be documented. Any known side effects of the mitigation are called out.
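The completion conditions above can be read as a four-item checklist. The sketch below shows one hypothetical way to report which conditions are still unmet; the `StageState` fields and the condition labels are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class StageState:
    """Hypothetical snapshot of the Mitigate stage's outputs."""
    log_has_timestamped_actions: bool
    verified_with_detecting_signals: bool
    rollback_plan_documented: bool
    side_effects_reviewed: bool


def unmet_conditions(state: StageState) -> list[str]:
    """Return the completion conditions that are not yet satisfied."""
    checks = {
        "mitigation log with timestamps": state.log_has_timestamped_actions,
        "impact cessation verified with detecting signals":
            state.verified_with_detecting_signals,
        "rollback plan for the mitigation": state.rollback_plan_documented,
        "known side effects called out": state.side_effects_reviewed,
    }
    return [name for name, ok in checks.items() if not ok]
```

An empty list from `unmet_conditions` would correspond to the stage being complete under these assumptions.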