Validation

Validate data quality, schema compliance, and business rules.

Hats: 2
Review Agents: 1 (+1 included from other stages)
Review: Ask
Unit Types: Validation
Inputs: Transformation

Dependencies

  • Transformation → modeled-data
Hat Sequence

1. Data Quality Reviewer

Focus: Review the validation suite for coverage completeness and assertion quality. Verify that tests cover all critical data paths, that thresholds are appropriately tight, and that failure modes produce actionable diagnostics rather than opaque errors.

Produces: Coverage assessment identifying gaps in the validation suite, threshold recommendations, and a verdict on whether the pipeline is safe to deploy.

Reads: Validator's test suite, transformation logic, data model documentation, SLA requirements from discovery.

Anti-patterns (RFC 2119):

  • The agent MUST NOT rubber-stamp a validation suite without tracing coverage back to requirements
  • The agent MUST NOT accept row count checks as sufficient without uniqueness and referential integrity tests
  • The agent MUST verify that validation failures produce enough context to diagnose the root cause
  • The agent MUST NOT ignore SLA-related validations (freshness, completeness percentages)
  • The agent MUST NOT treat validation as a gate to pass rather than a safety net to maintain
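The first anti-pattern above is the key one: coverage must be traced back to requirements, not assumed. A minimal sketch of that tracing, with hypothetical requirement IDs and test names (none of these come from a real pipeline):

```python
# Hypothetical sketch: trace validation coverage back to requirements
# instead of rubber-stamping the suite. IDs and test names are illustrative.

REQUIREMENTS = {
    "REQ-UNIQ": "primary keys are unique in every target table",
    "REQ-REF": "foreign keys resolve against parent tables",
    "REQ-FRESH": "data is no older than the SLA freshness window",
}

# Which requirements each existing test claims to cover.
TEST_COVERAGE = {
    "test_orders_pk_unique": {"REQ-UNIQ"},
    "test_orders_customer_fk": {"REQ-REF"},
}

def coverage_gaps(requirements, test_coverage):
    """Return requirement IDs that no test covers."""
    covered = set().union(*test_coverage.values())
    return sorted(set(requirements) - covered)

gaps = coverage_gaps(REQUIREMENTS, TEST_COVERAGE)
# REQ-FRESH has no test, so the reviewer's verdict should flag it
# as a gap rather than approving the suite.
```

Any requirement left in `gaps` is a concrete, citable finding for the coverage assessment, which keeps the verdict grounded in evidence rather than impressions.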
2. Validator

Focus: Build and run data quality checks that verify schema compliance, referential integrity, uniqueness, accepted value ranges, row count reconciliation, and business rule correctness. Every assertion should be specific, automated, and produce a clear pass/fail/warning result.

Produces: Validation suite with per-table and per-column assertions, row count reconciliation between source and target, and business rule edge case tests.

Reads: Modeled data from transformation, source catalog from discovery, business rules from the intent.

Anti-patterns (RFC 2119):

  • The agent MUST NOT write only "happy path" tests without edge case coverage
  • The agent MUST NOT check row counts without also checking for duplicates
  • The agent MUST NOT validate schema structure without also validating actual data values
  • The agent MUST NOT use overly loose thresholds that mask real quality issues
  • The agent MUST distinguish between blocking failures and non-blocking warnings
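The last two anti-patterns, loose thresholds and the blocking/non-blocking distinction, can be made concrete. A sketch of two checks that emit the pass/fail/warning results the hat describes; the table data and the 0.1% tolerance are illustrative assumptions:

```python
# Minimal sketch of per-check results with blocking vs non-blocking
# severity. Thresholds and sample data are illustrative, not prescriptive.

def check_unique(rows, key):
    """Blocking failure if any duplicate key values exist."""
    values = [r[key] for r in rows]
    dupes = len(values) - len(set(values))
    return {"check": f"unique:{key}",
            "status": "fail" if dupes else "pass",
            "detail": f"{dupes} duplicate value(s)"}

def check_row_count(source_count, target_count, tolerance=0.001):
    """Pass within tolerance, warn within 10x tolerance, fail beyond."""
    variance = abs(source_count - target_count) / max(source_count, 1)
    if variance <= tolerance:
        status = "pass"
    elif variance <= tolerance * 10:
        status = "warning"   # non-blocking: investigate, don't halt deploy
    else:
        status = "fail"      # blocking: reconciliation is off
    return {"check": "row_count_reconciliation", "status": status,
            "detail": f"variance={variance:.4%}"}

rows = [{"id": 1}, {"id": 2}, {"id": 2}]   # duplicate id -> blocking fail
results = [check_unique(rows, "id"), check_row_count(1000, 999)]
```

Note that the row count check alone would pass here while the uniqueness check fails, which is exactly why the anti-patterns forbid treating row counts as sufficient on their own.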

Review Agents

Coverage

Mandate: The agent MUST verify validation rules cover all data quality dimensions.

Check:

  • The agent MUST verify that schema compliance checks cover all fields (type, format, range, nullability)
  • The agent MUST verify that business rule validations match the transformation specifications
  • The agent MUST verify that row count reconciliation between source and target is performed
  • The agent MUST verify that sample-based spot checks verify actual data values, not just structure
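The first check above, field-level coverage of type, range, and nullability, can be sketched as a per-row validator. The field-spec format here is an assumption for illustration, not a required schema language:

```python
# Sketch: field-level schema compliance covering type, nullability,
# and accepted ranges/values. The spec format is an illustrative assumption.

FIELD_SPECS = {
    "amount": {"type": float, "nullable": False, "min": 0.0},
    "currency": {"type": str, "nullable": False, "allowed": {"USD", "EUR"}},
}

def validate_row(row, specs):
    """Return a list of violations for one row; empty list means compliant."""
    violations = []
    for field, spec in specs.items():
        value = row.get(field)
        if value is None:
            if not spec.get("nullable", True):
                violations.append(f"{field}: null not allowed")
            continue
        if not isinstance(value, spec["type"]):
            violations.append(f"{field}: expected {spec['type'].__name__}")
            continue
        if "min" in spec and value < spec["min"]:
            violations.append(f"{field}: below minimum {spec['min']}")
        if "allowed" in spec and value not in spec["allowed"]:
            violations.append(f"{field}: not in accepted values")
    return violations

bad = validate_row({"amount": -5.0, "currency": "GBP"}, FIELD_SPECS)
# two violations: negative amount, disallowed currency
```

Because each violation names the field and the rule it broke, failures carry the actionable context the reviewer mandate demands.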

Included from other stages

Validation

Criteria Guidance

Good criteria examples:

  • "Data quality checks cover uniqueness, not-null constraints, referential integrity, and accepted value ranges for every target table"
  • "Row count reconciliation between source and target is within the agreed tolerance (e.g., < 0.1% variance)"
  • "Business rule tests verify at least 3 known edge cases per critical transformation (e.g., timezone handling, currency conversion, null propagation)"

Bad criteria examples:

  • "Data quality is validated"
  • "Tests pass"
  • "Business rules are checked"
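What separates the good criteria from the bad ones is that each good criterion maps directly to an executable assertion. A sketch for the row count criterion; the 0.1% tolerance is taken from the example above and is illustrative, not a universal default:

```python
# Sketch: the "< 0.1% variance" criterion as a single executable check.
# The tolerance value is the example criterion's, not a recommended default.

def within_tolerance(source_count, target_count, max_variance=0.001):
    """True when source/target row counts reconcile within tolerance."""
    if source_count == 0:
        return target_count == 0
    return abs(source_count - target_count) / source_count < max_variance

assert within_tolerance(1_000_000, 999_500)      # 0.05% variance: ok
assert not within_tolerance(1_000_000, 990_000)  # 1% variance: too loose
```

A criterion like "Tests pass", by contrast, cannot be traced to any single assertion, which is what makes it a bad criterion.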

Completion Signal (RFC 2119)

The validation suite MUST cover schema compliance, uniqueness, referential integrity, accepted value ranges, and row count reconciliation. Business rule tests MUST verify edge cases. The data quality reviewer MUST have confirmed that test coverage is sufficient and that all critical paths have assertions. Validation results MUST be logged with pass/fail/warning status per check.
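The logging requirement in the completion signal can be sketched as a small serializer that rejects unknown statuses. The record shape and check name are illustrative assumptions:

```python
# Sketch of the completion signal's logging requirement: each check emits
# one pass/fail/warning record. The record shape is an illustrative assumption.
import json

def log_result(check_name, status, detail=""):
    """Serialize one validation result; status must be a known value."""
    if status not in {"pass", "fail", "warning"}:
        raise ValueError(f"unknown status: {status!r}")
    record = {"check": check_name, "status": status, "detail": detail}
    return json.dumps(record, sort_keys=True)

line = log_result("orders.pk_unique", "pass")
```

Restricting status to exactly three values keeps downstream tooling simple: anything that is not "pass" is either a blocking "fail" or a non-blocking "warning", matching the distinction the Validator hat is required to make.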