Learn the Pattern · Part 3

ETL vs ELT: Where Transformation Lives

The entire debate comes down to one question — do you clean the data before it lands, or after? Cheap cloud storage flipped the default answer.

In 60 seconds

ETL and ELT use the same three letters. The only thing that moves is WHERE the "T" happens.

  1. ETL = Extract → Transform → Load. Clean the data before it enters the warehouse. Only polished data lands.
  2. ELT = Extract → Load → Transform. Load the raw data first, then transform it inside the warehouse with SQL.
  3. Why the flip: cloud storage got cheap and warehouse compute became elastic, so keeping raw data and transforming it in place is affordable.
  4. Reprocessing is easy: when business logic changes, re-run the transform on the raw data you kept.
  5. Transforms become code: versioned, tested SQL (for example with dbt) instead of a black-box tool.

The Same Letters, Reordered

For decades, storage was expensive, so keeping raw data that no report would ever query made little sense. The ETL pattern was born of that constraint: a dedicated tool transformed and cleaned data in a separate staging area, and only the finished, modeled result was loaded into the warehouse. The raw data was usually discarded.

Then cloud warehouses arrived with cheap storage and nearly unlimited, elastic compute. It was now often cheaper, and far more flexible, to load the raw data first and transform it in place. That reordering, ELT, is now the default pattern of the modern stack.

ETL — Transform First
  • Transformation happens outside the warehouse
  • Only clean, modeled data lands
  • Raw data usually discarded
  • Hard to reprocess with new logic
  • Era of expensive storage
ELT — Load First
  • Raw data lands in a Bronze/staging layer
  • Transformations run inside the warehouse
  • Raw data is kept and replayable
  • Easy to reprocess with new logic
  • Era of cheap storage + elastic compute
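
To make the two columns above concrete, here is a minimal sketch that runs both orderings against the same input. It uses Python's built-in sqlite3 as a stand-in for a warehouse; the sample rows, the table names (orders, raw_orders), and the cleaning rules are all hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical source rows: (order_id, amount as text, status).
SOURCE_ROWS = [
    (1, " 19.99", "PAID"),
    (2, "5.00 ", "paid"),
    (3, "n/a", "cancelled"),  # unparseable amount
]


def clean(row):
    """Parse the amount and normalize the status; return None if unparseable."""
    order_id, amount, status = row
    try:
        return (order_id, float(amount.strip()), status.lower())
    except ValueError:
        return None


def run_etl(conn):
    """ETL: transform outside the warehouse, then load only the clean rows."""
    cleaned = [r for r in map(clean, SOURCE_ROWS) if r is not None]
    conn.execute("CREATE TABLE orders (order_id INTEGER, amount REAL, status TEXT)")
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", cleaned)


def run_elt(conn):
    """ELT: load the raw rows untouched, then transform with SQL in place."""
    conn.execute("CREATE TABLE raw_orders (order_id INTEGER, amount TEXT, status TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?, ?)", SOURCE_ROWS)
    conn.execute(
        """
        CREATE TABLE orders AS
        SELECT order_id,
               CAST(TRIM(amount) AS REAL) AS amount,
               LOWER(status) AS status
        FROM raw_orders
        WHERE TRIM(amount) GLOB '[0-9]*'  -- keep rows whose amount starts with a digit
        """
    )


def table_names(conn):
    """List the tables that exist after a run."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [name for (name,) in rows]


etl_conn = sqlite3.connect(":memory:")
elt_conn = sqlite3.connect(":memory:")
run_etl(etl_conn)
run_elt(elt_conn)

print("ETL tables:", table_names(etl_conn))
print("ELT tables:", table_names(elt_conn))
```

Both runs end with the same two clean rows in orders. The difference is what else survives: the ETL database has no trace of the rejected row, while the ELT database still holds all three source rows in raw_orders.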

Why ELT Won (Mostly)

1. Keep the Raw, Replay Anytime

Because raw data is preserved, a change in business logic doesn't mean re-ingesting from the source — you just re-run the transformation. Found a bug in last quarter's revenue calculation? Fix the SQL and rebuild from the raw data you already have.
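
The replay idea can be sketched in a few lines. This is an illustrative example, not a production pattern: sqlite3 stands in for the warehouse, and the raw_payments table, the refund rule, and the rebuild_revenue helper are hypothetical names invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The raw layer: loaded once from the source and never modified afterwards.
conn.execute("CREATE TABLE raw_payments (payment_id INTEGER, amount REAL, kind TEXT)")
conn.executemany(
    "INSERT INTO raw_payments VALUES (?, ?, ?)",
    [(1, 100.0, "charge"), (2, 40.0, "charge"), (3, 25.0, "refund")],
)

# Version 1 of the transform has a bug: refunds are counted as revenue.
REVENUE_V1 = "SELECT SUM(amount) AS revenue FROM raw_payments"

# Version 2 fixes the logic: refunds reduce revenue.
REVENUE_V2 = """
    SELECT SUM(CASE WHEN kind = 'refund' THEN -amount ELSE amount END) AS revenue
    FROM raw_payments
"""


def rebuild_revenue(conn, transform_sql):
    """Drop the derived table and rebuild it from the raw table that was kept."""
    conn.execute("DROP TABLE IF EXISTS revenue")
    conn.execute(f"CREATE TABLE revenue AS {transform_sql}")
    return conn.execute("SELECT revenue FROM revenue").fetchone()[0]


buggy = rebuild_revenue(conn, REVENUE_V1)
fixed = rebuild_revenue(conn, REVENUE_V2)
print(buggy, fixed)  # 165.0 115.0
```

Nothing is re-extracted between the two runs. The fix is a change to the SQL followed by a rebuild, which is only possible because raw_payments was kept.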

2. Transformations as Tested Code

ELT pushed transformation into SQL that lives in version control. Tools like dbt turned transformations into a software-engineering discipline: modular models, automated tests, documentation, and lineage — instead of opaque drag-and-drop mappings in a legacy tool.
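
As a rough illustration of what "tested" means here, the sketch below treats each test as a query that returns the rows violating an expectation, so zero rows means the test passes. That mirrors the general idea behind dbt's data tests but is not dbt's API; the model SQL, the test names, and the build_and_test helper are hypothetical, and sqlite3 stands in for the warehouse.

```python
import sqlite3

# A "model": one SQL statement that builds a clean table from the raw layer.
MODEL_SQL = """
    CREATE TABLE customers AS
    SELECT customer_id, LOWER(TRIM(email)) AS email
    FROM raw_customers
    WHERE email IS NOT NULL
"""

# Each test is a query that selects the rows breaking an expectation.
TESTS = {
    "customer_id_is_unique": """
        SELECT customer_id FROM customers
        GROUP BY customer_id HAVING COUNT(*) > 1
    """,
    "email_is_not_null": "SELECT customer_id FROM customers WHERE email IS NULL",
}


def build_and_test(conn):
    """Rebuild the model, then return the number of failing rows per test."""
    conn.execute("DROP TABLE IF EXISTS customers")
    conn.execute(MODEL_SQL)
    return {name: len(conn.execute(sql).fetchall()) for name, sql in TESTS.items()}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE raw_customers (customer_id INTEGER, email TEXT)")
conn.executemany(
    "INSERT INTO raw_customers VALUES (?, ?)",
    [
        (1, " Ada@Example.com "),
        (2, "bob@example.com"),
        (2, "bob@example.com"),  # duplicate delivered by the source
        (3, None),
    ],
)

results = build_and_test(conn)
print(results)  # {'customer_id_is_unique': 1, 'email_is_not_null': 0}
```

The uniqueness test reports one failing key, which flags the duplicate before anyone builds a dashboard on top of it. Because both the model and its tests are plain text, they can be reviewed and versioned like any other code.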

3. When ETL Still Makes Sense

ELT isn't universal. When you must mask or drop sensitive data before it ever lands (PII, compliance), or when a source feeds many systems and should be cleaned once at the edge, transforming first still wins. The honest rule: choose by where it's cheapest and safest to transform — not by fashion.
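
A transform-first flow for sensitive fields might look like the sketch below. It is an assumption-laden illustration, not compliance guidance: the field names, the fabricated sample records, the hard-coded MASKING_KEY, and the choice of a keyed SHA-256 hash (Python's hmac module) are all hypothetical, and sqlite3 again stands in for the warehouse.

```python
import hashlib
import hmac
import sqlite3

# Assumption: in practice the key would come from a secret manager, not source code.
MASKING_KEY = b"hypothetical-masking-key"


def mask(value):
    """Replace a sensitive value with a keyed hash of its normalized form."""
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(MASKING_KEY, normalized, hashlib.sha256).hexdigest()


def extract():
    """Stand-in for reading from a source system (fabricated sample values)."""
    return [
        {"user_id": 1, "email": "Ada@Example.com", "ssn": "123-45-6789", "plan": "pro"},
        {"user_id": 2, "email": "bob@example.com", "ssn": "987-65-4321", "plan": "free"},
    ]


def transform(record):
    """Runs BEFORE load: mask the email, drop the ssn field entirely."""
    return (record["user_id"], mask(record["email"]), record["plan"])


def load(conn, rows):
    """Only the masked rows ever reach the warehouse."""
    conn.execute("CREATE TABLE users (user_id INTEGER, email_hash TEXT, plan TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?, ?)", rows)


conn = sqlite3.connect(":memory:")
load(conn, [transform(r) for r in extract()])
stored = conn.execute(
    "SELECT user_id, email_hash, plan FROM users ORDER BY user_id"
).fetchall()
for row in stored:
    print(row)
```

Because the same email always produces the same hash, the masked column can still be used for joins and distinct counts. The trade-off is the one described above: the raw values never land, so they cannot be replayed later.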

ELT, Visualized

[Figure: Sources (DBs · APIs · Files) → Extract + Load (load raw first) → Raw / Bronze (kept & replayable) → Transform → Gold (SQL in-warehouse) → BI · ML · Apps (business value)]

ELT: load raw first, transform in place — so the raw copy stays replayable.

Same Pattern, Every Platform

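The Pattern      | Snowflake             | Databricks            | BigQuery                      | Microsoft Fabric
-----------------+-----------------------+-----------------------+-------------------------------+----------------------------------------
Load raw         | Snowpipe / COPY INTO  | Auto Loader           | Storage Write API / load jobs | Data Factory pipelines / Dataflows Gen2
Transform (ELT)  | dbt / SQL             | dbt / Spark SQL / DLT | dbt / SQL                     | Notebooks / Dataflows Gen2 / SQL
Raw landing zone | Bronze schema / stage | Bronze (Delta)        | Raw dataset                   | Lakehouse Bronze (OneLake)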

Feature names evolve — treat this as a capability map, and confirm specifics against current vendor docs.

The takeaway: ETL vs ELT isn't a tooling war — it's a decision about where the "T" runs. On any modern platform the default is ELT: land raw, transform in place with versioned SQL. Learn that pattern and every "which tool?" question gets simpler.
