Architecture Wednesday – EduDataSci – Educating the world about data and leadership

Making Schema Change Boring: A Short History—and How Microsoft Fabric’s Medallion Lakehouse Bakes It In

Schema changes have always been risky because a schema isn’t just columns—it’s the interface between data producers and data consumers. Historically, that interface was rigid, which made any change expensive. Modern lakehouse design solves the problem structurally: a Medallion architecture separates where variation is tolerated (Bronze) from where commitment is made (Silver) and relied upon (Gold). In Microsoft Fabric, those roles map cleanly to Lakehouse, Warehouse, and Power BI’s semantic layer, with governance and domain‑oriented (data‑product) design tying it all together. By the end, you’ll see why schema evolution is both inevitable and manageable—and how Fabric builds that manageability into the platform.

Data Vault, Practically: Why It Exists, How It’s Built, and What 2.1 Changes

Modern data platforms live in tension:

Source systems evolve faster than dimensional models can absorb.
Audit and lineage are mandatory, but teams still need velocity.
Cloud lakehouses, streaming, and domain ownership do not slot neatly into yesterday’s warehouse playbooks.

Data Vault is a response to those pressures. It is both a modeling approach and a delivery method designed to (1) absorb change, (2) preserve complete, immutable history, and (3) decouple integration from consumption. Its core building blocks are Hubs, Links, and Satellites, which organize into a Raw Vault (source truth, append‑only) and a Business Vault (governed derivations and query assistance). Think of it as a fault‑tolerant integration substrate with a clean seam to marts, semantic models, and data products.
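
To make the Hub and Satellite roles concrete, here is a minimal in-memory sketch of an append-only load. It is an illustration only, not a reference implementation: the names `hash_key`, `hash_diff`, and `load_customer`, the choice of SHA-256, and the upper-casing of business keys are all assumptions made for this example, and the tables are plain Python structures rather than a real platform.

```python
import hashlib

def _digest(text: str) -> str:
    """Hex digest used for both hub keys and change detection (assumed: SHA-256)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def hash_key(*business_keys: str) -> str:
    """Hub key derived from normalized business keys (normalization is an assumption)."""
    return _digest("||".join(k.strip().upper() for k in business_keys))

def hash_diff(attributes: dict) -> str:
    """Fingerprint of the descriptive attributes, used to detect a change."""
    return _digest("||".join(f"{k}={attributes[k]}" for k in sorted(attributes)))

def load_customer(hub: dict, satellite: list, customer_id: str,
                  attributes: dict, record_source: str, load_ts: str) -> None:
    """Insert the hub row once; append a satellite row only when attributes change."""
    hk = hash_key(customer_id)
    if hk not in hub:
        hub[hk] = {"customer_id": customer_id,
                   "load_ts": load_ts,
                   "record_source": record_source}
    diff = hash_diff(attributes)
    # Latest satellite row for this hub key, if any (the list is in load order).
    latest = next((r for r in reversed(satellite) if r["hub_key"] == hk), None)
    if latest is None or latest["hash_diff"] != diff:
        # Append-only: earlier rows are never updated or deleted.
        satellite.append({"hub_key": hk, "load_ts": load_ts,
                          "record_source": record_source,
                          "hash_diff": diff, **attributes})

hub, sat = {}, []
load_customer(hub, sat, "C001", {"city": "Austin"}, "crm", "2024-01-01T00:00:00Z")
load_customer(hub, sat, "C001", {"city": "Austin"}, "crm", "2024-01-02T00:00:00Z")
load_customer(hub, sat, "C001", {"city": "Denver"}, "crm", "2024-01-03T00:00:00Z")
```

After these three loads the hub holds one row for the customer and the satellite holds two, because the unchanged second load adds nothing. History is preserved by appending rows, never by overwriting them.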

Why Star Schemas Make Analysts Faster (and Happier)

If you live in spreadsheets or SQL all day, the “one big table” (OBT) feels like home. Everything you need is right there: one row per thing, a column for every attribute, and no joins to worry about. It’s a great way to explore data fast—until it isn’t. This post explains, in plain language, why the star schema pays you back every day you analyze data, and how it keeps the speed you love without the headaches you’ve learned to live with.

Baselines Over Buzzwords: From Warehouse to Lakehouse

If you’ve built data systems long enough, you’ve lived through at least three architectural moods: the tidy certainty of Kimball and Inmon, the anarchic freedom of “throw everything in the data lake to ingest quickly,” and today’s lakehouse, which tries to keep our speed without losing our sanity. I’ve always cared less about labels and more about baselines—clear, durable expectations that make change safe. This piece traces how those baselines shifted, what we gained and lost, and how to rebuild them for modern work, including real‑time, very large, and unstructured data.

Slowly Changing Dimensions (SCDs): A Practical Guide for Your Star Schema

Star schemas shine when your facts (events) are analyzed through dimensions (who/what/where/when). But in real life, dimension attributes change—customers move, products rebrand, sales territories realign. Slowly changing dimensions (SCDs) are the modeling patterns that preserve analytic correctness as those attributes evolve.
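
As a rough illustration of one such pattern, the sketch below applies a Type 2 change, which closes the current row and adds a new one so that history is kept. The row layout (`CustomerRow`, `valid_from`, `valid_to`, `is_current`) and the function name `apply_type2_change` are hypothetical choices for this example, not something the post prescribes.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional

@dataclass(frozen=True)
class CustomerRow:
    surrogate_key: int        # warehouse-generated key that fact rows point at
    customer_id: str          # business (natural) key from the source system
    city: str                 # the tracked attribute
    valid_from: date
    valid_to: Optional[date]  # None while the row is current
    is_current: bool

def apply_type2_change(rows: List[CustomerRow], customer_id: str,
                       new_city: str, change_date: date) -> List[CustomerRow]:
    """Return a new dimension with the change applied as a Type 2 update."""
    current = next((r for r in rows
                    if r.customer_id == customer_id and r.is_current), None)
    if current is not None and current.city == new_city:
        return list(rows)  # nothing changed, so no new version is needed
    next_key = max((r.surrogate_key for r in rows), default=0) + 1
    updated = [
        # Expire the old version instead of overwriting it.
        replace(r, valid_to=change_date, is_current=False) if r is current else r
        for r in rows
    ]
    updated.append(CustomerRow(next_key, customer_id, new_city,
                               change_date, None, True))
    return updated

history = [CustomerRow(1, "C001", "Austin", date(2020, 1, 1), None, True)]
history = apply_type2_change(history, "C001", "Denver", date(2024, 6, 1))
```

Facts recorded before the move keep pointing at surrogate key 1 (Austin), while later facts use key 2 (Denver), so historical reports stay correct after the customer relocates.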

A Practical Introduction to Star Schema Data Architecture

Dimensional modeling remains the most effective way to make analytics fast, understandable, and resilient. The star schema sits at the center of that approach: a simple, denormalized structure where fact tables record measurable events and dimension tables provide descriptive context. In this post, we’ll ground the core ideas, clarify the often‑confused concept of snowflaking (and when it’s worth it), and show how to scale from a single star to a galaxy schema (a.k.a. fact constellation) without losing your footing.
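
A tiny sketch of the fact and dimension split, using in-memory Python structures in place of real tables. The table names, columns, and values here are made up for illustration.

```python
from collections import defaultdict

# Dimension: descriptive context, keyed by a surrogate key.
dim_product = {
    1: {"name": "Widget", "category": "Hardware"},
    2: {"name": "Gadget", "category": "Hardware"},
    3: {"name": "Plugin", "category": "Software"},
}

# Fact: one row per measurable event, holding keys and numeric measures only.
fact_sales = [
    {"product_key": 1, "date_key": 20240101, "quantity": 3, "amount": 30.0},
    {"product_key": 2, "date_key": 20240101, "quantity": 1, "amount": 20.0},
    {"product_key": 3, "date_key": 20240102, "quantity": 5, "amount": 15.0},
]

def sales_by_category(facts, products):
    """Join each fact row to its product and sum the amount per category."""
    totals = defaultdict(float)
    for row in facts:
        category = products[row["product_key"]]["category"]
        totals[category] += row["amount"]
    return dict(totals)

result = sales_by_category(fact_sales, dim_product)
```

The fact rows stay narrow, and any descriptive attribute (here, category) is available for grouping through a single key lookup.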