Azure Data Platform Modernization Playbook
Build or modernize a cloud data platform on Azure: ADF/Fabric pipelines, PySpark transformations, Synapse/Fabric warehouse, BI consumption, and data governance.
Tell us what you're building
We use these answers to surface the prompts, skills, and MCP configs that fit your stack — and to substitute stack values like {{database}} into the prompts you copy. Content (PRDs, code, etc.) stays in your repo. Everything you enter here is stored in your browser — nothing is sent to a server.
Steps you'll go through
- 01
Discovery and Source Inventory
4-6 weeksProfile every major source system, document quality issues, identify downstream consumers, and capture stakeholder pain points. This is a stakeholder conversation, not just a tech audit.
- 02
Target Architecture Design
3-4 weeksMake the Fabric vs Synapse vs Hybrid call. Design the lakehouse (medallion). Decide storage layout, compute strategy, orchestration, consumption layer, and governance — then capture as ADRs.
- 03
Pipeline Framework Design
4-6 weeksBuild the metadata-driven framework before building any specific pipelines. Multiplier work — done right, every subsequent pipeline is faster.
- 04
PySpark Transformation Framework
3-4 weeksEstablish PySpark standards before writing many notebooks. Shared utility library, Bronze / Silver / Gold templates, performance patterns, testing approach.
- 05
Data Quality Framework
3-4 weeksBuild the DQ framework before pipelines ship to production. Schema, freshness, volume, nulls, duplicates, ranges, referential integrity, business invariants.
- 06
First Domain (Vertical Slice)
8-12 weeksMigrate / build one complete business domain end-to-end (Customer or Sales is typical). Bronze + Silver + Gold + DQ + a Power BI semantic model. Validates everything from Phases 3-5 against a real use case.
- 07
Continue Expanding Domains
1-3 quartersBuild out remaining domains, refining patterns. Common slicing: by business domain, by source system, by criticality, or by region.
- 08
Migration of Consumption (BI, ML, Downstream Apps)
1-2 quarters (often parallel with Phase 7)Switch consumers from legacy data sources to the new platform. Power BI, Tableau, ML pipelines, custom apps, Excel queries — each gets a different migration path.
- 09
Decommission Legacy + Optimize
4-8 weeks for decommission, ongoing for optimizationTurn off old systems. Run cost optimization (typically 30%+ savings opportunity). Tune performance. Mature governance.