The Ideal Microsoft Fabric CI/CD Approach: Git for Change, Deployment Pipelines for Promotion, and a Code-First Escape Hatch

Microsoft Fabric CI/CD has a reputation for being confusing—usually because people look at Git integration and Deployment Pipelines as competing ideas rather than two halves of a single delivery story.

The good news is that the “ideal” approach is not exotic. It’s a handoff:

  • Use Git integration to support real developer workflows (including branching that maps cleanly to isolated workspaces).
  • Use Deployment Pipelines to promote approved changes across environments.
  • When you need richer approvals, tests, and release controls, let traditional tooling—especially GitHub Actions or Azure DevOps Pipelines—orchestrate promotions via the Fabric REST APIs.

In this post, I’ll lay out that end-to-end pattern step-by-step, show where the seams belong, and call out the cost you can’t ignore: workspace sprawl—and the operational discipline required to manage aged workspaces intentionally.
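
To make the escape hatch concrete, here's roughly what a code-first promotion looks like—a minimal sketch assuming the deployment pipelines "deploy" endpoint of the Fabric REST API, with every ID and token a placeholder your CI runner would supply:

```python
# Minimal sketch: promote content between deployment pipeline stages from CI.
# Assumes an Entra access token with Fabric API scope (e.g., acquired by a
# service principal login step earlier in the workflow). IDs are placeholders.
import requests

FABRIC_API = "https://api.fabric.microsoft.com/v1"
PIPELINE_ID = "<deployment-pipeline-id>"
TOKEN = "<entra-access-token>"

resp = requests.post(
    f"{FABRIC_API}/deploymentPipelines/{PIPELINE_ID}/deploy",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "sourceStageId": "<dev-stage-id>",
        "targetStageId": "<test-stage-id>",
        "note": "Promoted by CI after tests passed",
    },
)
resp.raise_for_status()  # deployment is long-running; a 202 means "poll the operation"
```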

Continue reading “The Ideal Microsoft Fabric CI/CD Approach: Git for Change, Deployment Pipelines for Promotion, and a Code-First Escape Hatch”

The NotebookUtils Gems I Wish More Fabric Notebooks Used

Most Fabric notebook code I review has the same telltale shape: a little Spark, a hardcoded path (or three), and just enough glue logic to “get it to run.” And then, a month later, someone copies it into another workspace and everything breaks.

NotebookUtils is one of the easiest ways to avoid that fate. It’s built into Fabric notebooks, it’s designed for the common “day two” problems (orchestration, configuration, identities, file movement), and it’s still surprisingly underused. NotebookUtils is also the successor to mssparkutils—backward compatible today, but clearly where Microsoft is investing going forward.

In this post, I’m going to do two things:

  • Give you a quick, practical orientation to NotebookUtils in Fabric.
  • Walk through the functions I reach for most often—especially the ones I don’t see enough in real projects: runtime.context, runMultiple()/validateDAG(), variableLibrary.getLibrary(), fs.fastcp(), fs.getMountPath(), credentials.getToken(), and lakehouse.loadTable().

Along the way, I’ll call out a few patterns that make notebooks feel less like “scripts you run” and more like reusable components in Microsoft Fabric data engineering work.
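
To give you a feel for it, here’s a compressed sketch of several of those calls in one notebook cell—library, path, and notebook names are all placeholders:

```python
import notebookutils  # built into Fabric notebook runtimes

# Where am I running? Handy for environment-aware behavior.
ctx = notebookutils.runtime.context
print(ctx.get("currentWorkspaceName"), ctx.get("defaultLakehouseName"))

# Pull settings from a Variable Library instead of hardcoding them.
cfg = notebookutils.variableLibrary.getLibrary("EnvConfig")  # "EnvConfig" is hypothetical

# Acquire a token for a downstream call as the running identity.
token = notebookutils.credentials.getToken("storage")

# Parallel file copy—much faster than fs.cp for lots of small files.
notebookutils.fs.fastcp("Files/raw/", "Files/staged/", True)

# Fan out child notebooks as a DAG instead of serial %run calls.
dag = {
    "activities": [
        {"name": "load_orders", "path": "LoadOrders", "timeoutPerCellInSeconds": 600},
        {"name": "load_items", "path": "LoadItems", "dependencies": ["load_orders"]},
    ]
}
notebookutils.notebook.validateDAG(dag)  # catch wiring mistakes before running
notebookutils.notebook.runMultiple(dag)
```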

Continue reading “The NotebookUtils Gems I Wish More Fabric Notebooks Used”

DirectLake Without OneLake Access: A Fixed-Identity Pattern That Keeps the Lakehouse Off-Limits

There’s a moment that catches a lot of Fabric teams off guard.

You publish a beautiful report on a DirectLake semantic model. Users can slice, filter, and explore exactly the way you intended. Then someone asks, “Why can I open the lakehouse and browse the tables?” Or worse: “Why can I query the SQL analytics endpoint directly?”

If your objective is semantic model consumption without lake access, the default DirectLake behavior can feel like it’s working against you. By default, DirectLake uses Microsoft Entra ID single sign-on (SSO)—meaning the viewer’s identity must be authorized on the underlying Fabric data source.

This post walks through a clean, operationally heavier—but very effective—pattern:

Bind the DirectLake semantic model to a shareable cloud connection with a fixed identity, and keep SSO disabled. Then do not grant end users any permissions on the lakehouse/warehouse item. Users can query the semantic model, but they can’t browse OneLake or query the data item directly.
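
If you script your environment setup, creating that fixed-identity connection can live in code too. A rough sketch against the Fabric Connections REST API—the connectionDetails shape varies by source type, and every ID, name, and secret below is a placeholder:

```python
# Rough sketch: create a shareable cloud connection with a fixed identity
# and SSO disabled. The exact connectionDetails for your source (SQL
# analytics endpoint, OneLake, etc.) will differ from this assumption.
import requests

TOKEN = "<entra-access-token>"  # an identity allowed to create connections

body = {
    "connectivityType": "ShareableCloud",
    "displayName": "directlake-fixed-identity",
    "connectionDetails": {
        "type": "SQL",  # assumption: binding via the SQL analytics endpoint
        "creationMethod": "SQL",
        "parameters": [
            {"dataType": "Text", "name": "server", "value": "<sql-endpoint>"},
            {"dataType": "Text", "name": "database", "value": "<lakehouse-db>"},
        ],
    },
    "privacyLevel": "Organizational",
    "credentialDetails": {
        "singleSignOnType": "None",  # the whole point: no viewer SSO
        "connectionEncryption": "Encrypted",
        "credentials": {
            "credentialType": "ServicePrincipal",
            "tenantId": "<tenant-id>",
            "servicePrincipalClientId": "<client-id>",
            "servicePrincipalSecret": "<secret>",
        },
    },
}

resp = requests.post(
    "https://api.fabric.microsoft.com/v1/connections",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=body,
)
resp.raise_for_status()
print(resp.json()["id"])  # bind this connection to the semantic model next
```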

Along the way, we’ll also cover the “gotchas” that trip teams up (especially around permissions and “SSO is still on somewhere”), plus a few guardrails that matter for real-world data governance in Microsoft Fabric.

Continue reading “DirectLake Without OneLake Access: A Fixed-Identity Pattern That Keeps the Lakehouse Off-Limits”

Workspace Sprawl Isn’t Your Fabric Problem—Stale Workspaces Are

“Do we really need another workspace?”

If you’ve built anything meaningful in Microsoft Fabric, you’ve heard some version of that question. It usually comes wrapped in a familiar anxiety: workspace sprawl. Too many containers. Too much to govern. Too hard to manage.

Here’s the reframing that matters: workspace count is rarely the risk. The real risk is stale workspaces and stale data—the forgotten corners of your tenant where ownership is unclear, permissions linger, and the platform quietly accumulates operational and compliance debt.

In this post I’ll walk through why “workspace sprawl” is a false fear, why workspaces naturally form clusters (and why good development multiplies them), and how intentional permissioning in Microsoft Entra and Fabric keeps management from becoming a linear slog—especially once you introduce automation and tooling. Along the way, I’ll ground the point in the real mechanics of Microsoft Fabric rather than vibes.

Continue reading “Workspace Sprawl Isn’t Your Fabric Problem—Stale Workspaces Are”

Fabric Environments Feel Like a Turbo Button—Until Private Link Gets Involved

If you’ve spent any real time in notebooks, you’ve felt it: the “why am I doing this again?” moment. You start a session, install the same libraries, chase a version mismatch, restart a kernel, and finally get back to what you actually came to Fabric to do.

Microsoft Fabric Environments are a strong answer to that pain. They pull your Spark runtime choice, compute settings, and library dependencies into a reusable, shareable artifact you can attach to notebooks and Spark Job Definitions. And with the latest previews—Azure Artifact Feed support inside Environments and Fabric Runtime 2.0 Experimental—it’s clear Microsoft is investing in making Spark development in Microsoft Fabric more repeatable and more “team ready.”

There’s a catch, though: once you introduce Private Link (and the networking controls that tend to come with it), some of the most convenient paths close off. So the story becomes less about “click to go faster” and more about “choose your trade-offs intentionally.”

In this post, I’ll cover what Fabric Environments are, what’s new in the previews (Artifact Feeds + Runtime 2.0), why Environments speed up real work, and where Private Link limits your options—and what you can do about it.

Continue reading “Fabric Environments Feel Like a Turbo Button—Until Private Link Gets Involved”

The Hidden Permission Chain Behind Cross-Workspace Lakehouse Shortcuts (for Semantic Models)

One of the cleanest patterns in Microsoft Fabric is splitting your world in two: a “data product” workspace that owns curated lakehouses, and an “analytics” workspace that owns semantic models and reports. You connect the two with a OneLake shortcut, and suddenly you’ve avoided copies, reduced refresh complexity, and kept your architecture tidy.

Then the first DirectLake semantic model hits that shortcut and… the tables don’t load.

This post walks through what’s really happening in that moment in Microsoft Fabric, what permissions you actually need (and where), and how to tighten the whole pattern with OneLake Security instead of simply widening access. We’ll also cover the easy-to-miss caveat: if your shortcut ultimately lands on a Fabric SQL Database, you still have to do SQL permissions, too.
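
For reference, the shortcut itself is one call to the OneLake Shortcuts API—IDs below are placeholders. The interesting part (and the subject of this post) is that the identity creating it, and the viewers behind the semantic model, need the right permissions on both sides of it:

```python
# Sketch: create a cross-workspace OneLake shortcut via the Shortcuts API.
# All IDs are placeholders; the caller needs rights in BOTH workspaces.
import requests

TOKEN = "<entra-access-token>"
analytics_ws = "<analytics-workspace-id>"
analytics_lh = "<analytics-lakehouse-id>"

resp = requests.post(
    f"https://api.fabric.microsoft.com/v1/workspaces/{analytics_ws}"
    f"/items/{analytics_lh}/shortcuts",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json={
        "path": "Tables",        # where the shortcut appears locally
        "name": "dim_customer",
        "target": {
            "oneLake": {
                "workspaceId": "<data-product-workspace-id>",
                "itemId": "<data-product-lakehouse-id>",
                "path": "Tables/dim_customer",  # source table in the data product
            }
        },
    },
)
resp.raise_for_status()
```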

Continue reading “The Hidden Permission Chain Behind Cross-Workspace Lakehouse Shortcuts (for Semantic Models)”

Stop Paying Hot-Tier Prices for Cold Data: Using ADLS Gen2 to Tame Fabric Ingestion Storage Costs

If you’ve been living in Microsoft Fabric for a few months, you’ve probably felt it: the platform makes it incredibly easy to ingest data… and surprisingly easy to rack up storage spend while you’re doing it (especially considering how much storage is included).

The pattern is common. A team starts with a Lakehouse, adds Pipelines or Dataflows Gen2 for ingestion, follows a sensible medallion approach, and before long they’re keeping “just in case” raw files, repeated snapshots, and long-running history inside OneLake—often at the same performance tier as yesterday’s data. The storage bill grows quietly. Capacity pressure shows up in places you didn’t expect. And suddenly “simple ingestion” is a FinOps conversation.

Here’s the good news: you don’t have to choose between Fabric and sensible archival strategy. Azure Data Lake Storage Gen2 (ADLS Gen2) can be your pressure relief valve—your durable landing zone and archive—while Fabric stays the place you compute, curate, model, and serve.

What follows is a deep dive into how to use ADLS Gen2 accounts to solve the archival and storage-cost traps that show up during Fabric ingestion: where the costs come from, what architectural patterns work well, and the practical implementation details (shortcuts, security, and billing mechanics) that make it real for Microsoft Fabric teams.
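
As a taste of the mechanics, here's a small sketch that demotes aged raw files to the Archive tier with the azure-storage-blob SDK—account, container, and paths are placeholders, and in production you'd usually let a lifecycle management policy on the storage account do this automatically:

```python
# Sketch: demote aged raw files to Archive in ADLS Gen2 so OneLake keeps
# only the hot, curated layers. Account, container, and prefix are placeholders.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import ContainerClient, StandardBlobTier

container = ContainerClient(
    account_url="https://<archiveaccount>.blob.core.windows.net",
    container_name="landing",
    credential=DefaultAzureCredential(),
)

for blob in container.list_blobs(name_starts_with="raw/2023/"):
    # Anything older than the working set drops out of hot storage.
    container.get_blob_client(blob.name).set_standard_blob_tier(
        StandardBlobTier.ARCHIVE
    )
```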

Continue reading “Stop Paying Hot-Tier Prices for Cold Data: Using ADLS Gen2 to Tame Fabric Ingestion Storage Costs”

Data Quality as Code in Fabric: Declarative Checks on Materialized Lake Views

If you’ve ever shipped a “clean” silver or gold table only to discover (later) that it quietly included null keys, impossible dates, or negative quantities… you already know the real pain of data quality.

The frustration isn’t that bad data exists. The frustration is that quality rules often live somewhere else: in a notebook cell, in a pipeline activity, in a dashboard someone checks (sometimes), or in tribal knowledge that never quite becomes a contract.

Microsoft Fabric’s Materialized Lake Views (MLVs) give you a more disciplined option: you can define declarative data quality checks inside the MLV definition using constraints, and then use Fabric’s built-in monitoring, lineage, and embedded Power BI Data Quality reports to understand how quality is trending across your lakehouse and your data products.

In this post, I’ll cover what these checks look like, how to add them, and—most importantly—how to turn them into quality signals you can operationalize for a Microsoft Fabric lakehouse and the Data Engineering teams who depend on it.
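
To set expectations, here's roughly what a declarative check looks like—a sketch with illustrative table and constraint names, wrapped in spark.sql for a Fabric notebook:

```python
# Sketch: declarative data quality constraints on a Materialized Lake View.
# ON MISMATCH DROP quarantines failing rows out of the view; FAIL would
# stop the refresh instead. Table and constraint names are illustrative.
spark.sql("""
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS silver.orders_clean
(
    CONSTRAINT order_id_present CHECK (order_id IS NOT NULL) ON MISMATCH DROP,
    CONSTRAINT qty_positive     CHECK (quantity > 0)         ON MISMATCH DROP
)
AS
SELECT order_id, customer_id, order_date, quantity
FROM bronze.orders_raw
""")
```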

One note before we dive in: this post is about structural data quality. Data integrity—making sure your data follows your business logic, makes sense, and isn’t drifting—is a separate discipline. These techniques can be adapted for it, but there are more efficient ways to implement it.

Continue reading “Data Quality as Code in Fabric: Declarative Checks on Materialized Lake Views”

The Advanced Lakehouse Data Product: Shortcuts In, Materialized Views Through, Versioned Schemas Out

There’s a familiar tension in modern analytics: teams want data products that are easy to discover and safe to consume, but they also want to move fast—often faster than the governance model can tolerate.

In Microsoft Fabric, that tension frequently shows up as a perception of workspace sprawl. A “single product per workspace” model is clean on paper—strong boundaries, tidy ownership, straightforward promotion—but it can quickly turn into dozens (or hundreds) of workspaces to curate, secure, and operate.

This post proposes a different pattern—an advanced lakehouse approach that treats the lakehouse itself like a product factory:

  • Shortcuts or schema shortcuts become the input layer (a clean, contract-aware “ingest without copying” boundary).
  • A small-step transformation layer is implemented as a multi-step DAG using Materialized Lake Views (MLVs).
  • A versioned, schema-based surface area becomes the data product contract you expose to consumers.

Then we connect that to OneLake security and Fabric domains, showing how you can expose left-shifted data products (usable earlier in the lifecycle) without letting workspaces multiply endlessly.
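
As a tiny sketch of the “versioned schemas out” half—names are illustrative—consumers bind to a v1 schema, and breaking changes ship as v2 alongside it rather than in place:

```python
# Sketch: the data product contract is a versioned schema. Consumers bind
# to sales_v1; a breaking change ships as sales_v2 next to it. Names and
# the upstream MLV are illustrative.
spark.sql("CREATE SCHEMA IF NOT EXISTS sales_v1")

spark.sql("""
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS sales_v1.daily_revenue
AS
SELECT order_date, SUM(quantity * unit_price) AS revenue
FROM transform.orders_enriched   -- last step of the internal MLV DAG
GROUP BY order_date
""")
```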

Continue reading “The Advanced Lakehouse Data Product: Shortcuts In, Materialized Views Through, Versioned Schemas Out”

Freeze-and-Squash: Turning Snapshot Tables into a Versioned Change Feed with Fabric Materialized Lake Views

Periodic snapshots are a gift and a curse.

They’re a gift because they’re easy to land: each load is a complete “as-of” picture, and ingestion rarely needs fancy orchestration. They’re a curse because the moment you want history with meaning—a clean versioned change feed, a Type 2 dimension, a Data Vault satellite—you’re suddenly writing heavy window logic, MERGEs, and stateful pipelines that are harder to reason about than the business problem you were trying to solve.

This post describes a Fabric Materialized Lake View (MLV) pattern that “squashes” a rolling set of snapshot tables down into a bounded, versioned change feed by pairing a chain of MLVs with a periodically refreshed frozen table. We’ll walk the pattern end-to-end, call out where it shines (and where it doesn’t), and then show how the resulting change feed can support both Slowly Changing Dimensions and Data Vault processes in an MLV-forward Microsoft Fabric lakehouse architecture.

Before we go too far: the gold standard is still getting a change feed directly from the source system (CDC logs, transactional events, source-managed “effective dating,” or an authoritative change table). When you can get that, take it. Everything else—including this pattern—is a disciplined way of making the best of snapshots.
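
For a flavor of the squash step itself, here's a sketch—illustrative names throughout—that collapses snapshots into change rows by comparing each row's hash to its previous snapshot; the full pattern in the post unions this with the frozen table so the snapshot window stays bounded:

```python
# Sketch: squash daily snapshots into change rows. A row is a "change" when
# its hash differs from the same key's previous snapshot (or it's first seen).
# Table and column names are illustrative.
spark.sql("""
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS silver.customer_change_feed
AS
WITH ordered AS (
    SELECT customer_id, snapshot_date, row_hash, name, tier,
           LAG(row_hash) OVER (
               PARTITION BY customer_id ORDER BY snapshot_date
           ) AS prev_hash
    FROM bronze.customer_snapshots
)
SELECT customer_id, snapshot_date AS effective_date, name, tier
FROM ordered
WHERE prev_hash IS NULL OR prev_hash <> row_hash
""")
```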

Continue reading “Freeze-and-Squash: Turning Snapshot Tables into a Versioned Change Feed with Fabric Materialized Lake Views”