Learn the Pattern · Part 12

Governance, Catalog & Lineage

The layer that turns a data swamp into a trusted asset — answering who can see it, what it means, and where every number came from.

In 60 seconds

As soon as more than one team uses data, three questions appear. Governance answers all three.

  1. WHO can see this? — access control: the right people, the right rows and columns.
  2. WHAT does this column mean? — a catalog: a searchable inventory with definitions, owners, and tags.
  3. WHERE did this number come from? — lineage: the map from source → transformation → dashboard.
  4. Plus quality & compliance — SLAs, PII handling, audit trails.
  5. The payoff — people trust the data. Skip it and every number sparks a debate.

The Layer That Wraps Everything

Across this series, governance has been the layer that wraps all the others (Part 1). It isn't a step in the pipeline — it's the connective tissue of trust. The moment data is shared beyond the person who built it, three questions become unavoidable: who's allowed to see it, what it actually means, and where it came from. Governance is the discipline of answering those questions at scale.

The Three Pillars

1. Access Control — Who

The right people see the right data — and only that. Modern governance goes beyond table-level grants to row-level and column-level security (a regional manager sees only their region; analysts see a masked version of PII). Centralizing these policies — rather than scattering them across systems — is what makes access both safe and auditable.
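As a rough illustration of row-level and column-level security driven from one central policy table, here is a minimal sketch in plain Python. The role names, the `Policy` class, and the `read_rows` helper are all hypothetical and not any vendor's API; real platforms enforce these rules inside the query engine rather than in application code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    # Row-level rule: regions this role may read (None means every region).
    allowed_regions: Optional[frozenset] = None
    # Column-level rule: columns returned in masked form.
    masked_columns: frozenset = frozenset()


# One central place for policies, keyed by a hypothetical role name.
POLICIES = {
    "emea_manager": Policy(allowed_regions=frozenset({"EMEA"})),
    "analyst": Policy(masked_columns=frozenset({"email"})),
}


def mask(value):
    """Keep the first character and hide the rest."""
    return str(value)[:1] + "***"


def read_rows(role, rows):
    """Return only the rows and column values the role is allowed to see."""
    policy = POLICIES[role]  # an unknown role raises KeyError: deny by default
    visible = []
    for row in rows:
        if policy.allowed_regions is not None and row["region"] not in policy.allowed_regions:
            continue  # row-level security: skip rows outside the role's regions
        visible.append({
            col: mask(val) if col in policy.masked_columns else val
            for col, val in row.items()
        })
    return visible


# Illustrative data only.
ORDERS = [
    {"order_id": 1, "region": "EMEA", "email": "ana@example.com", "amount": 120},
    {"order_id": 2, "region": "APAC", "email": "bo@example.com", "amount": 80},
]
```

Calling `read_rows("emea_manager", ORDERS)` returns only the EMEA order, while `read_rows("analyst", ORDERS)` returns both orders with `email` masked. Because every read goes through the same `POLICIES` table, there is a single place to review or audit who can see what.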

2. Data Catalog — What

A catalog is a searchable inventory of every dataset: definitions, owners, descriptions, tags, freshness, and popularity. It's how a new analyst answers "is there already a table for monthly revenue, and can I trust it?" without asking around for a week. The catalog turns a sprawling platform into something discoverable.
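To make the idea of a searchable inventory concrete, the following sketch models a catalog as a list of entries and a keyword search that ranks certified datasets first. The `CatalogEntry` fields, the `certified` flag, and the dataset names are assumptions chosen for illustration; actual catalogs track far more metadata (freshness, popularity, schema).

```python
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str
    description: str
    owner: str
    tags: tuple = ()
    certified: bool = False  # hypothetical "can I trust it?" signal


# Illustrative entries only.
CATALOG = [
    CatalogEntry("gold.monthly_revenue", "Recognized revenue per calendar month",
                 "finance-data", ("revenue", "finance"), certified=True),
    CatalogEntry("silver.orders_clean", "Deduplicated orders with currency normalized",
                 "sales-eng", ("orders",)),
    CatalogEntry("sandbox.rev_test", "Scratch copy of monthly revenue",
                 "unassigned", ("scratch",)),
]


def search(catalog, term):
    """Case-insensitive match on name, description, or tags; certified entries first."""
    needle = term.lower()
    hits = [
        entry for entry in catalog
        if needle in entry.name.lower()
        or needle in entry.description.lower()
        or any(needle in tag.lower() for tag in entry.tags)
    ]
    return sorted(hits, key=lambda entry: (not entry.certified, entry.name))
```

A search for "revenue" here surfaces `gold.monthly_revenue` ahead of the sandbox copy, which is the kind of answer the new analyst needs: a table exists, it has an owner, and it is marked as trustworthy.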

3. Data Lineage — Where

Lineage is the end-to-end map of how data flows: this dashboard number comes from this Gold table, built from these Silver models, derived from these raw sources. When a figure looks wrong, lineage lets you trace it back to the cause in minutes; when a source changes, it shows you every downstream asset that's affected. It's the difference between confident answers and educated guesses.
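Both uses of lineage described above (tracing a number back to its sources, and finding everything affected by a source change) reduce to walking a dependency graph in one direction or the other. The sketch below assumes lineage is available as a simple mapping from each asset to the assets it is built from; the asset names and helper functions are hypothetical.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets it is built from.
UPSTREAM = {
    "dashboard.revenue_kpi": ["gold.monthly_revenue"],
    "gold.monthly_revenue": ["silver.orders_clean", "silver.fx_rates"],
    "gold.churn": ["silver.orders_clean"],
    "silver.orders_clean": ["raw.orders"],
    "silver.fx_rates": ["raw.fx_feed"],
}


def invert(edges):
    """Flip the graph so each asset maps to the assets built from it."""
    downstream = {}
    for child, parents in edges.items():
        for parent in parents:
            downstream.setdefault(parent, []).append(child)
    return downstream


def reachable(edges, start):
    """Breadth-first walk returning every asset reachable from start."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def trace_to_sources(asset):
    """Everything the asset depends on, directly or indirectly."""
    return reachable(UPSTREAM, asset)


def impact_of_change(asset):
    """Everything that depends on the asset, directly or indirectly."""
    return reachable(invert(UPSTREAM), asset)
```

With this graph, `trace_to_sources("dashboard.revenue_kpi")` includes both `raw.orders` and `raw.fx_feed`, and `impact_of_change("raw.orders")` includes `gold.churn` as well as the revenue dashboard, which is the downstream asset someone debugging only the revenue number could easily miss.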

Plus: Quality, Compliance & Audit

Governance also carries the obligations: data-quality SLAs (Part 10), handling of regulated data (PII, and data covered by regulations such as GDPR or HIPAA), retention policies, and audit trails of who accessed or changed what. Increasingly this is framed as data products with explicit owners and contracts: accountability, not just access.
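One way to picture a data contract with an explicit owner is as a small record of promises that can be checked mechanically, alongside an append-only audit log. The sketch below is an assumption-laden illustration: the `Contract` fields, the `violations` checks, and the `record_access` helper are hypothetical and not drawn from any specific tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Contract:
    dataset: str
    owner: str                 # who is accountable for the dataset
    max_staleness: timedelta   # freshness SLA
    retention: timedelta       # how long records may be kept
    pii_columns: frozenset     # columns that must never be exposed unmasked


def violations(contract, last_loaded, oldest_record, unmasked_columns, now):
    """Return a list of human-readable contract breaches (empty means healthy)."""
    problems = []
    if not contract.owner:
        problems.append("no owner")
    if now - last_loaded > contract.max_staleness:
        problems.append("freshness SLA missed")
    if now - oldest_record > contract.retention:
        problems.append("records held past retention")
    exposed = contract.pii_columns & set(unmasked_columns)
    if exposed:
        problems.append("unmasked PII: " + ", ".join(sorted(exposed)))
    return problems


AUDIT_LOG = []  # append-only in this sketch; a real trail would be tamper-evident


def record_access(user, dataset, action, now):
    """Append one audit entry describing who did what to which dataset, and when."""
    AUDIT_LOG.append(
        {"at": now.isoformat(), "user": user, "dataset": dataset, "action": action}
    )


# Illustrative contract and timestamps only.
NOW = datetime(2024, 1, 10, tzinfo=timezone.utc)
REVENUE_CONTRACT = Contract(
    dataset="gold.monthly_revenue",
    owner="finance-data",
    max_staleness=timedelta(hours=24),
    retention=timedelta(days=365),
    pii_columns=frozenset({"email"}),
)
```

Passing the clock in as `now` keeps the checks deterministic and easy to test. In this example a load that is 30 hours old with an unmasked `email` column would yield two violations, each attributable to the named owner.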

Lineage & the Governance Wrapper, Visualized

Data flows left to right; governance (access · catalog · lineage) wraps the entire chain:

[Figure: Source (operational data) → Pipeline (transform & model) → Gold Table (business-ready) → Dashboard (trusted number), all enclosed by a governance layer: access control (who) · catalog (what) · lineage (where).]

Lineage traces the trusted number all the way back to its source; governance wraps every hop.

Same Pattern, Every Platform

Pillar         | Snowflake                  | Databricks            | BigQuery          | Microsoft Fabric
Governance hub | Horizon Catalog            | Unity Catalog         | Dataplex          | Microsoft Purview
Access control | RBAC · row/column policies | Unity Catalog grants  | IAM · policy tags | Purview · workspace roles
Lineage        | Built-in lineage           | Unity Catalog lineage | Dataplex lineage  | Purview lineage

Feature names evolve — treat this as a capability map, and confirm specifics against current vendor docs.

The takeaway — and the close of the series: who / what / where is the governance pattern on every platform; only the product names change (Horizon, Unity Catalog, Dataplex, Purview). That's twelve patterns now that outlive any single platform. Learn the pattern, not the product.
