Step 05 of 9 · 3-4 weeks · advanced
Step 5: Data Quality Framework
Build the DQ framework before pipelines ship to production. It should cover schema, freshness, volume, nulls, duplicates, ranges, referential integrity, and business invariants.
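To make the check categories above concrete, here is a minimal sketch in plain Python, assuming rows arrive as a list of dicts. The function names, the `orders` sample, and the thresholds are all hypothetical illustrations, not part of any particular DQ library; a real framework would usually run equivalent checks in SQL or a dedicated tool.

```python
from datetime import datetime, timedelta, timezone

def check_schema(rows, expected_types):
    """Indexes of rows with unexpected columns or wrongly typed non-null values."""
    bad = []
    for i, row in enumerate(rows):
        same_columns = set(row) == set(expected_types)
        if not same_columns or any(
            row[col] is not None and not isinstance(row[col], typ)
            for col, typ in expected_types.items()
        ):
            bad.append(i)
    return bad

def check_not_null(rows, column):
    """Indexes of rows where the column is missing or None."""
    return [i for i, r in enumerate(rows) if r.get(column) is None]

def check_unique(rows, column):
    """Indexes of rows that repeat a value already seen in the column."""
    seen, bad = set(), []
    for i, r in enumerate(rows):
        if r.get(column) in seen:
            bad.append(i)
        seen.add(r.get(column))
    return bad

def check_range(rows, column, low, high):
    """Indexes of rows whose non-null value falls outside [low, high]."""
    return [i for i, r in enumerate(rows)
            if r.get(column) is not None and not (low <= r[column] <= high)]

def check_foreign_key(rows, column, valid_keys):
    """Referential integrity: indexes of rows pointing at a missing parent key."""
    return [i for i, r in enumerate(rows) if r.get(column) not in valid_keys]

def check_invariant(rows, predicate):
    """Business invariant: indexes of rows for which the predicate is false."""
    return [i for i, r in enumerate(rows) if not predicate(r)]

def check_volume(rows, min_rows, max_rows):
    """True when the row count is within the expected band."""
    return min_rows <= len(rows) <= max_rows

def check_freshness(rows, column, now, max_age):
    """True when the newest timestamp is no older than max_age."""
    if not rows:
        return False
    return now - max(r[column] for r in rows) <= max_age

# Hypothetical sample batch with a few deliberate defects.
loaded = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 25.0, "loaded_at": loaded},
    {"order_id": 2, "customer_id": 11, "amount": -5.0, "loaded_at": loaded},
    {"order_id": 2, "customer_id": 99, "amount": None, "loaded_at": loaded},
]
expected = {"order_id": int, "customer_id": int, "amount": float, "loaded_at": datetime}

report = {
    "schema": check_schema(orders, expected),
    "nulls": check_not_null(orders, "amount"),
    "duplicates": check_unique(orders, "order_id"),
    "ranges": check_range(orders, "amount", 0, 10_000),
    "referential_integrity": check_foreign_key(orders, "customer_id", {10, 11}),
    "invariant_positive_id": check_invariant(orders, lambda r: r["order_id"] > 0),
    "volume_ok": check_volume(orders, 1, 1_000),
    "fresh": check_freshness(orders, "loaded_at",
                             now=loaded + timedelta(hours=1),
                             max_age=timedelta(hours=2)),
}
```

Row-level checks return the offending row indexes rather than a bare pass/fail, which makes it easier to route the bad rows somewhere other than the target table.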
When you're done
Verify these in your own work before moving on
This is a checklist for you to mentally tick off in your repo and IDE — the site doesn't track it, you do.
- DQ framework deployed and tested
- Standard test patterns documented
- Quarantine pattern working
- DQ reporting dashboard created
- SLOs defined for the first set of tables
- Alerting integrated
- DQ runs as part of every pipeline (not separate)
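Two items in the checklist above, the quarantine pattern and running DQ inside the pipeline, can be sketched together. The following is only an illustration under assumptions: `split_rows`, `dq_step`, the rule names, and the `max_quarantine_ratio` threshold are hypothetical, and a real pipeline would write quarantined rows to a dedicated table instead of returning them.

```python
class QuarantineThresholdExceeded(Exception):
    """Raised when too large a share of a batch fails its DQ rules."""

def split_rows(rows, rules):
    """Separate rows that pass every rule from rows that fail at least one."""
    clean, quarantined = [], []
    for row in rows:
        failed = [name for name, rule in rules.items() if not rule(row)]
        if failed:
            # Keep the failure reasons next to the row so it can be triaged later.
            quarantined.append({"row": row, "failed_rules": failed})
        else:
            clean.append(row)
    return clean, quarantined

def dq_step(rows, rules, max_quarantine_ratio):
    """Pipeline step: pass clean rows on, set bad rows aside, fail on a bad batch."""
    clean, quarantined = split_rows(rows, rules)
    ratio = len(quarantined) / len(rows) if rows else 0.0
    if ratio > max_quarantine_ratio:
        raise QuarantineThresholdExceeded(
            f"{len(quarantined)} of {len(rows)} rows failed DQ rules"
        )
    return clean, quarantined

# Hypothetical rules and batch.
rules = {
    "amount_present": lambda r: r.get("amount") is not None,
    "amount_non_negative": lambda r: r.get("amount") is None or r["amount"] >= 0,
}
batch = [
    {"id": 1, "amount": 10.0},
    {"id": 2, "amount": None},
    {"id": 3, "amount": -1.0},
    {"id": 4, "amount": 7.5},
]

clean, quarantined = dq_step(batch, rules, max_quarantine_ratio=0.6)
```

The threshold gives the middle ground: a few bad rows are set aside while good rows continue, but a batch that is mostly bad still stops the pipeline.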
Common pitfalls
What goes wrong at this step
- Skipping DQ until production: issues are then discovered only after bad data has shipped. Build it in from day 1
- DQ tests that always pass: a test that cannot fail gives false confidence
- No quarantine pattern: the pipeline either fails or ships bad data, with no middle ground
- DQ separate from pipelines: the checks drift out of sync and rules are not applied consistently
- No alerting on violations: silent failures are the worst kind
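Two of the pitfalls above lend themselves to a short sketch: proving that a check can actually fail, and making sure every violation reaches an alert. Everything here is a hypothetical illustration; in particular, `alert` stands in for whatever notification channel you use and is just a callable that receives a message.

```python
def no_negative_amounts(rows):
    """A working check: fails when any amount is negative."""
    return all(r["amount"] >= 0 for r in rows)

def always_passes(rows):
    """A broken check that can never fail, shown for contrast."""
    return True

def check_can_fail(check, good_rows, bad_rows):
    """A check is trustworthy only if it passes good data and fails bad data."""
    return check(good_rows) and not check(bad_rows)

def run_checks(rows, checks, alert):
    """Run every check and send one alert per violation; return failed names."""
    failures = [name for name, check in checks.items() if not check(rows)]
    for name in failures:
        alert(f"DQ violation: {name}")
    return failures

# Known-good and known-bad fixtures used to test the checks themselves.
good = [{"amount": 5.0}, {"amount": 0.0}]
bad = [{"amount": 5.0}, {"amount": -3.0}]

trustworthy = check_can_fail(no_negative_amounts, good, bad)
broken = check_can_fail(always_passes, good, bad)

# Collect alerts in a list here; a real alert callable would notify someone.
sent = []
failed = run_checks(bad, {"no_negative_amounts": no_negative_amounts}, sent.append)
```

Running each check against a deliberately bad fixture is a cheap way to catch the always-passing case before it reaches production.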