In Parts 1–2, we framed a data product as a reusable, self‑contained package that bundles data, metadata, access methods, and governance to deliver an outcome—discoverable, interoperable, and managed like software. We also separated foundational (stable, domain‑anchored) from derived (composed/enriched for specific use‑cases) and showed how composition is the workhorse of value delivery.
This third part makes that guidance concrete on Microsoft Fabric: why Fabric is a natural home for data products, which product types you can expose, and how to govern and compose them—including zero‑copy patterns and two near‑term preview capabilities: Materialized Lake Views and Shortcut Transformations.
Why Fabric is a natural home for data products
One logical lake, one copy, many engines. Fabric’s OneLake gives your tenant a single logical lake with Delta Lake at the core, so Spark, SQL, Warehouse, KQL, and Power BI can all read/write the same open tables. That keeps the data product’s “contract” stable while different personas use their native tools.
Virtualize first with Shortcuts (across clouds). OneLake Shortcuts let you reference data in ADLS, S3, GCS, or other OneLake locations without copying, creating a unified namespace your engines can query immediately—key to zero‑copy publishing and cross‑domain composition.
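As a rough illustration of how a Shortcut can be created programmatically, the sketch below only builds the URL and JSON body for a OneLake-to-OneLake shortcut and sends nothing. The endpoint and field names follow my reading of the Fabric REST API for shortcuts and should be verified against the current reference. The function name and every ID or path in the usage lines are hypothetical placeholders.

```python
import json

# Assumption: base URL of the Fabric REST API; confirm against current docs.
FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_onelake_shortcut_request(consumer_workspace_id, consumer_item_id,
                                   shortcut_path, shortcut_name,
                                   source_workspace_id, source_item_id,
                                   source_path):
    """Return (url, body) for a create-shortcut call. Nothing is sent."""
    url = (f"{FABRIC_API}/workspaces/{consumer_workspace_id}"
           f"/items/{consumer_item_id}/shortcuts")
    body = {
        "path": shortcut_path,   # where the shortcut appears in the consumer item
        "name": shortcut_name,   # display name of the shortcut
        "target": {
            # Assumed shape of a OneLake target; other targets (ADLS, S3, GCS)
            # use different keys and need a connection.
            "oneLake": {
                "workspaceId": source_workspace_id,
                "itemId": source_item_id,
                "path": source_path,
            }
        },
    }
    return url, body


# Placeholder IDs for illustration only.
url, body = build_onelake_shortcut_request(
    "consumer-ws-id", "consumer-lakehouse-id",
    "Tables", "customers",
    "source-ws-id", "source-lakehouse-id", "Tables/customers",
)
print(url)
print(json.dumps(body, indent=2))
```

Sending the request would additionally need a bearer token with the right Fabric scopes, which is left out here on purpose.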
Share across tenants without copies. External Data Sharing creates a read‑only cross‑tenant Shortcut to your OneLake path, so partners consume your product in place—again, no duplication. (Admins must enable it in both tenants.)
Real‑time + batch, same lake. In Real‑Time Intelligence (Eventhouse/KQL), OneLake availability exposes streaming tables to OneLake in Delta format—with no extra storage charge—so BI/ML can consume the same event data that KQL queries operationally.
Semantic performance path. Direct Lake semantic models read Delta tables directly from OneLake for interactive analysis at scale—ideal for turning foundational tables into governed, derived metric products without import/ETL debt.
Governance is built‑in. OneLake Catalog (Explore/Govern tabs) plus Purview hub centralize discoverability, endorsements, sensitivity labeling, domains, and posture insights—so data products are findable and consistently protected. OneLake security (preview) adds table/folder, row‑level (RLS), and column‑level (CLS) rules directly on lake tables to serve restricted, zero‑copy slices.
The product types to expose in Fabric (and how they map to foundational vs. derived)
The product types: curated datasets, live event data with context, semantic models, reports, Data Agents, and ML models. Foundational vs. derived describes how you compose them; it is not a separate type system.
Two near‑term previews you’ll want in the toolbox
- Materialized Lake Views: Define declarative SQL transformations with built‑in lineage, monitoring, and data quality rules—a clean way to publish a curated, consumption‑ready surface as a derived product without hand‑built pipelines. (MLVs materialize to Delta for performance; they reduce bespoke ETL, but they’re not a pure zero‑copy view.)
- Shortcut Transformations: Start from a Shortcut (for example, a CSV folder in S3/ADLS/OneLake), and let Fabric automatically convert/synchronize it into a Delta table—no pipelines to maintain. There are also AI Shortcut Transformations (preview) for text (summarization, sentiment, PII detection, entity extraction), producing governed Delta tables on a frequent sync. (These do create managed Delta outputs; your source remains authoritative.)
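To make the Materialized Lake View idea more tangible, here is a minimal sketch that assembles an MLV definition as a Spark SQL string. The `CREATE MATERIALIZED LAKE VIEW ... CONSTRAINT ... CHECK ... ON MISMATCH` shape reflects my understanding of the preview syntax and may change, so treat it as an assumption. The helper name, schema, table, and column names are all hypothetical.

```python
def build_mlv_ddl(view_name, select_sql, constraints=None):
    """Assemble an MLV definition string.

    constraints: optional list of (name, condition, action) tuples, where
    action is "DROP" (discard violating rows) or "FAIL" (stop the refresh).
    The syntax is assumed from the preview documentation.
    """
    lines = [f"CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS {view_name}"]
    if constraints:
        rendered = []
        for name, condition, action in constraints:
            if action not in ("DROP", "FAIL"):
                raise ValueError(f"unsupported ON MISMATCH action: {action}")
            rendered.append(
                f"  CONSTRAINT {name} CHECK ({condition}) ON MISMATCH {action}"
            )
        lines.append("(\n" + ",\n".join(rendered) + "\n)")
    lines.append("AS")
    lines.append(select_sql.strip())
    return "\n".join(lines)


# Hypothetical composite product joining two foundational tables.
ddl = build_mlv_ddl(
    "gold.customer_orders",
    """
    SELECT c.customer_id, c.segment, o.order_id, o.amount
    FROM silver.customers AS c
    JOIN silver.orders AS o ON o.customer_id = c.customer_id
    """,
    constraints=[("has_customer", "customer_id IS NOT NULL", "DROP")],
)
print(ddl)
# In a Fabric notebook the string would typically be run with spark.sql(ddl).
```

Keeping the data quality rule next to the transformation is the point: the constraint travels with the product definition rather than living in a separate pipeline step.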
Composing products in Fabric (foundational → derived) without copies
Derived data product composition patterns slot neatly into Fabric’s controls:
- Restricted products (subsets of a foundation): Publish governed slices via OneLake security (table/folder + RLS/CLS) or semantic‑model RLS—no cloned tables. Use Catalog + labels for discoverability & protection.
- Composite products (join foundations predictably): Use Materialized Lake Views to declaratively assemble “Customer × Orders × Entitlements,” then expose via Direct Lake for consistent interactive use.
- Semantic/metric products: Centralize definitions in semantic models (Direct Lake), so reports/agents share the same metrics (“Active Student,” “Gross Revenue”)—not bespoke DAX per report.
- Inferred/feature products: Train/register models in Fabric (MLflow), score to Delta (PREDICT) on a schedule, and version the output as a product for BI and downstream ML reuse.
- Real‑time + context: Turn on OneLake availability on Eventhouse to expose event tables in Delta; then use MLVs to join with curated dimensions; consume via Direct Lake for a coherent “live metrics” product—still avoiding extra copies.
- Cross‑tenant sharing: For partner access, publish the same product by creating an External Data Share (consumer receives a Shortcut). No physical export, no shadow lakes.
Governance: making product contracts explicit and enforceable
- Catalog first: Use OneLake Catalog to register, describe, endorse, and govern your data products (Explore for discovery, Govern for posture insights and recommended remediations).
- Labels and domains: Apply Purview sensitivity labels (and domain‑level defaults where appropriate) and organize workspaces under Domains for federated ownership aligned to your org structure.
- Row/column/table controls at the lake: With OneLake security (preview), enforce RLS and CLS directly on Delta tables/folders; keep the foundation small and safe, then compose derived products via authorized interfaces. (OneLake security is replacing the older data‑access‑roles preview.)
- Semantics + access in one place: Keep metric logic in semantic models (Direct Lake) and access control (e.g., RLS) at that layer when the product is a reportable/metric surface.
- Lineage + quality: MLVs give lineage and data‑quality monitoring on declarative transforms; surface those signals in Catalog and Purview hub to close the loop with product SLOs.
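One way to picture an explicit product contract is as a small record that can be checked before a product is published. The sketch below is purely illustrative: `DataProductContract`, its fields, and the rules in `contract_problems` are hypothetical and are not a Fabric or Purview API. In practice the label, endorsement, and domain would come from the Catalog itself.

```python
from dataclasses import dataclass, field
from typing import List

ALLOWED_KINDS = {"foundational", "derived"}


@dataclass
class DataProductContract:
    """Hypothetical descriptor for a data product (not a Fabric object)."""
    name: str
    owner_domain: str
    kind: str                       # "foundational" or "derived"
    sensitivity_label: str = ""
    endorsed: bool = False
    upstream: List[str] = field(default_factory=list)


def contract_problems(contract: DataProductContract) -> List[str]:
    """Return a list of human-readable gaps; an empty list means it passes."""
    problems = []
    if contract.kind not in ALLOWED_KINDS:
        problems.append(f"unknown kind: {contract.kind}")
    if not contract.owner_domain:
        problems.append("missing owner domain")
    if not contract.sensitivity_label:
        problems.append("missing sensitivity label")
    if contract.kind == "derived" and not contract.upstream:
        problems.append("derived product lists no upstream products")
    return problems


# Illustrative product names only.
draft = DataProductContract(
    name="customer_orders",
    owner_domain="Sales",
    kind="derived",
)
print(contract_problems(draft))
```

A check like this could run in a release pipeline so that a derived product without a label or declared upstream never reaches consumers.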
Zero‑copy in Fabric: what it does and doesn’t mean
Zero‑copy patterns:
- Shortcuts to internal/external stores → engines query in place.
- External Data Sharing (cross‑tenant Shortcuts) → partners consume your data in your lake.
- Direct Lake semantic models → reports query Delta directly (no import cache).
- Eventhouse OneLake availability → stream tables exposed to OneLake Delta for BI/ML without extra storage charge.
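Querying in place ultimately means every engine points at the same OneLake location. The helper below is a small sketch that builds the ABFS-style URI for a lakehouse table. The function name and the workspace, lakehouse, and table names are hypothetical, and it assumes names without spaces (GUID-based addressing uses a slightly different path form).

```python
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_table_uri(workspace, lakehouse, table, schema=None):
    """Build the abfss URI of a Delta table in a Fabric lakehouse.

    Assumes name-based addressing and names without spaces. Pass `schema`
    for schema-enabled lakehouses, where tables sit under Tables/<schema>/.
    """
    parts = ["Tables"]
    if schema:
        parts.append(schema)
    parts.append(table)
    return (f"abfss://{workspace}@{ONELAKE_HOST}/"
            f"{lakehouse}.Lakehouse/" + "/".join(parts))


uri = onelake_table_uri("Sales", "Core", "customers")
print(uri)
# In a Spark session with OneLake access, the same table could then be read
# without copying it, for example: spark.read.format("delta").load(uri)
```

The same URI works whether the table is native to the lakehouse or surfaced through a Shortcut, which is what lets consumers stay unaware of where the bytes physically live.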
Near‑zero‑copy patterns:
- Materialized Lake Views materialize curated tables for performance & monitoring (no custom pipelines, but there is a managed Delta output).
- Shortcut Transformations (and AI variants) synchronize from files referenced by a Shortcut into managed Delta tables (your source stays authoritative).
- Mirroring continuously replicates operational databases into OneLake Delta for analytics (low‑latency replica rather than virtualization).
Final thoughts
All of this makes Microsoft Fabric an ideal, and rapidly improving, place to build your data products. Each of these components, along with many that I’m sure are coming, points in one direction: Microsoft Fabric as one of the premier data management platforms. It is a logical place to start if you’re planning a data product‑focused deployment, especially if you want zero‑copy patterns without giving up speed or broad ingestion capability.