“Zero Copy” Doesn’t Mean “No Copies.” It Means “No Unmanaged Copies.”

The rallying cry of modern data platforms, Zero Copy, is revolutionary because it flips the default: don’t move data unless there’s a good reason, and when you do, let the platform manage the copy for you. In Microsoft Fabric, that starts with in-place access via OneLake Shortcuts and an open storage layer, then selectively uses managed and automated copies (like Mirroring and Materialized Lake Views) when they deliver clear value. The result is less sprawl, more trust, and faster analytics, all without hand-built duplication.

Start with in‑place access (the real “zero copy”)

Fabric’s OneLake Shortcuts let you unify data across clouds and accounts without copying bytes, presenting external data through a single namespace while OneLake centrally handles credentials and permissions. Because engines read common, open Delta/Parquet, the same data can power SQL, Spark, and BI without format-specific copies. This is “zero copy” in its purest form: virtualization instead of duplication. 

Fabric’s External Data Sharing across tenants is also “zero copy” because it works in place: Fabric creates a Shortcut in the recipient tenant that points back to the origin; no dataset is duplicated. This reinforces the idea that copies should be intentional platform artifacts, not side‑effect exports.

Then add managed copies when they buy you something

There are moments when creating a physical artifact is the right move—so long as the platform, not a person, owns the copy.

  • Fabric Mirroring continuously replicates operational databases or external warehouses into OneLake with low latency. You get analytics-ready Delta tables—kept current by the service—without writing CDC pipelines or juggling schedules. It’s a copy, yes, but a managed one with lineage, governance, and predictable behavior.
  • Materialized Lake Views (MLVs) precompute transformations and store the results in OneLake. They accelerate queries (and help you implement a medallion-style architecture) while the platform handles refresh and dependency tracking. Again, you gain performance without proliferating ad‑hoc extract tables.

Why “Zero Unmanaged Copies” hits the mark

Reframing the principle this way keeps the spirit—and clarifies the practice:

  • Default to virtualization: Reach data where it lives (Shortcuts), reuse it across engines (open Delta), and keep security centralized. 
  • Copy with intent: Use Mirroring for availability/latency and MLVs for performance. These are cache‑like artifacts: reproducible, governed, disposable.
  • Let the platform own the lifecycle: Refresh, lineage, permissions, and cleanup belong to Fabric services, not bespoke scripts that breed shadow datasets.

When your source is snapshots: making SCDs fit “Zero Unmanaged Copies”

Snapshot‑style sources (daily “full exports” with no reliable CDC) are where “Zero Copy” can go off the rails—teams often warehouse every snapshot to diff them later, creating piles of unmanaged data. The better pattern is to land each snapshot once in OneLake and let the platform compute the deltas, so your Slowly Changing Dimensions (SCDs) get their history without you hand‑crafting duplicate datasets. Fabric gives you two clean, managed options:

1) Mirror or land the snapshot once; use Delta Change Data Feed to derive row‑level changes.

If you can use Fabric Mirroring, it continuously replicates source tables into OneLake as Delta, keeping them current without custom pipelines—perfect as the authoritative “current state” landing for dimensions. Each write versions the table. With Delta Lake Change Data Feed (CDF) enabled, you can read only what changed between versions (inserts, updates, deletes) rather than reprocessing full snapshots. Feed those changes into an SCD Type 2 MERGE that closes the prior version (sets end_date / is_current = false) and inserts the new version—all from a single managed landing copy. 
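
The "close the prior version, insert the new version" step can be sketched in plain Python. This is an in-memory illustration of the SCD Type 2 logic only, not Fabric or Delta Lake API code; the function name apply_scd2, the change_type labels, and the row layout are assumptions made for this example. In a lakehouse the same logic would typically be expressed as a MERGE fed by the change feed.

```python
from datetime import date


def apply_scd2(dimension, changes, as_of):
    """Return a new dimension list with the change records applied.

    dimension: list of dicts with key, attrs, start_date, end_date, is_current
    changes:   list of dicts with key, change_type, attrs
               (change_type is 'insert', 'update', or 'delete'; hypothetical labels)
    as_of:     effective date of this batch of changes
    """
    changed_keys = {change["key"] for change in changes}
    result = []

    for row in dimension:
        if row["is_current"] and row["key"] in changed_keys:
            # Close the prior version instead of overwriting it.
            row = {**row, "end_date": as_of, "is_current": False}
        result.append(row)

    for change in changes:
        if change["change_type"] in ("insert", "update"):
            # Open a new current version for inserted or updated keys.
            result.append(
                {
                    "key": change["key"],
                    "attrs": change["attrs"],
                    "start_date": as_of,
                    "end_date": None,
                    "is_current": True,
                }
            )
    return result


dimension = [
    {"key": 1, "attrs": {"city": "Austin"}, "start_date": date(2024, 1, 1),
     "end_date": None, "is_current": True},
    {"key": 2, "attrs": {"city": "Dallas"}, "start_date": date(2024, 1, 1),
     "end_date": None, "is_current": True},
]
changes = [
    {"key": 1, "change_type": "update", "attrs": {"city": "Houston"}},
    {"key": 2, "change_type": "delete", "attrs": None},
    {"key": 3, "change_type": "insert", "attrs": {"city": "Waco"}},
]
updated = apply_scd2(dimension, changes, date(2024, 2, 1))
```

Deleted keys are closed but never reopened, so their history stays queryable while they drop out of the current view.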

2) If your upstream is strictly snapshot‑only, still avoid hoarding copies—version the same table and compare “then vs. now.”

Write each day’s snapshot as a new version of the same Delta table (or partition it by snapshot_date under one table), then compute differences either by reading CDF between the last two versions or by using Time Travel to read the previous version and join to today’s. You still maintain one authoritative landing table; the “history” lives in the Delta log, not in a sprawl of duplicate files. (Tune retention appropriately if you depend on longer look‑back windows.) 
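
The "then vs. now" comparison can be sketched in plain Python. This is an in-memory illustration of the diff logic only, not Fabric or Delta Lake API code; diff_snapshots, the id key column, and the sample rows are hypothetical. In practice the previous snapshot would come from a Time Travel read and the comparison would be a join in Spark or SQL.

```python
def diff_snapshots(previous, current, key="id"):
    """Compare two full snapshots and return row-level change records."""
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}
    changes = []

    for k, row in curr_by_key.items():
        if k not in prev_by_key:
            # Key appears only in today's snapshot.
            changes.append({"change_type": "insert", "row": row})
        elif row != prev_by_key[k]:
            # Key exists in both snapshots but some column differs.
            changes.append({"change_type": "update", "row": row})

    for k, row in prev_by_key.items():
        if k not in curr_by_key:
            # Key disappeared from today's snapshot.
            changes.append({"change_type": "delete", "row": row})

    return changes


yesterday = [{"id": 1, "city": "Austin"}, {"id": 2, "city": "Dallas"}]
today = [{"id": 1, "city": "Houston"}, {"id": 3, "city": "Waco"}]
snapshot_changes = diff_snapshots(yesterday, today)
```

Unchanged rows produce no output, so downstream SCD logic only sees the keys that were inserted, updated, or deleted between the two versions.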

Building the SCD the managed way

  • Change detection: Decide which attributes constitute a business‑meaningful change (often a hash of tracked columns). Use CDF as your incremental feed into the SCD MERGE so you only process rows that actually changed. 
  • SCD mechanics: Fabric supports SCD Type 2 patterns in low‑code Dataflows Gen2 or with SQL/Spark MERGEs, so you can standardize start/end timestamps, current flags, and surrogate keys without bespoke staging copies. 
  • Acceleration without duplication: If your SCD powers many queries, add Materialized Lake Views (MLVs) as managed accelerators for the “current” view (e.g., is_current = true) or common rollups. The service owns refresh, lineage, and scheduling—still zero unmanaged copies. 
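
The "hash of tracked columns" idea from the change detection bullet can be sketched as follows. This is a minimal example using only the Python standard library; tracked_hash, the column names, and the separator choices are assumptions for illustration, and a production pipeline would more likely compute the hash with the engine's built-in functions.

```python
import hashlib


def tracked_hash(row, tracked_columns):
    """Hash only the columns whose changes are business-meaningful."""
    parts = []
    for column in tracked_columns:
        value = row.get(column)
        # Use a distinct marker for missing values so None and "" differ.
        parts.append("\x00" if value is None else str(value))
    # Join with a separator so ("ab", "c") and ("a", "bc") hash differently.
    payload = "\x1f".join(parts)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


tracked = ["name", "city"]
before = {"id": 1, "name": "Ada", "city": "Austin", "last_login": "2024-01-01"}
after_login = {"id": 1, "name": "Ada", "city": "Austin", "last_login": "2024-02-01"}
after_move = {"id": 1, "name": "Ada", "city": "Houston", "last_login": "2024-02-01"}

login_is_change = tracked_hash(before, tracked) != tracked_hash(after_login, tracked)
move_is_change = tracked_hash(before, tracked) != tracked_hash(after_move, tracked)
```

Here a change to last_login does not register as a new dimension version because that column is not tracked, while a change to city does.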

Why this still honors “Zero Copy”

  • One landing table, many versions: The only persistent artifact you own is the authoritative Delta table; history is a managed aspect of that table (CDF/time‑travel), not a pile of side tables.
  • Platform‑owned movement: When movement is required (mirroring, materialization), Fabric owns it in a governed, observable, and reversible way, so you get performance and history without shadow datasets.

Even when the source only hands you snapshots, SCDs can live comfortably inside the “Zero Unmanaged Copies” framework—by versioning a single landing table, reading change feeds (or time‑traveling) to get just the deltas, and letting Fabric’s managed features (Mirroring, MLVs) do the heavy lifting.  

A practical rubric for teams

  • First choice: Shortcuts or external sharing (no physical copies).
  • When needed: Mirroring for near‑real‑time analytics on operational data; MLVs for expensive joins/aggregations.
  • Always: Treat any physical artifact as an implementation detail of the platform, not a new source of truth.

Putting it all together

Zero Copy was never a vow that “no bytes will ever be duplicated.” It’s a commitment that you won’t be the one duplicating them. With Shortcuts, Mirroring, and MLVs working together, Fabric delivers the benefits people want from Zero Copy—speed, simplicity, and governance—by eliminating unmanaged copies and elevating managed ones to first‑class, automated features. 


Author: Jason Miles

A solution-focused developer, engineer, and data specialist working across diverse industries. He has led data products and citizen data initiatives for almost twenty years and is an expert in enabling organizations to turn data into insight, and then into action. He holds an MS in Analytics from Texas A&M, along with DAMA CDMP Master and INFORMS CAP-Expert credentials.