Slowly Changing Dimensions

Slowly changing dimensions, what a confusing name. I’ve been hearing this term for a while now, but never bothered to properly look up what it means. As it turns out, it simply describes how you handle row updates and deletes in your data pipelines: whether you preserve history in your tables by appending a new row when a change happens, or update the existing row in place. ...
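
The two options described above can be sketched roughly as follows. This is a minimal illustration rather than the code from the post: the table is a plain list of dicts, and the `customer_id`, `city`, `valid_from` and `valid_to` columns are hypothetical names picked for the example. Updating in place is commonly called a Type 1 dimension, and appending a new row to keep history a Type 2 dimension.

```python
from datetime import date


def update_in_place(rows, key, new_attrs):
    """Overwrite the matching row; the previous values are lost."""
    for row in rows:
        if row["customer_id"] == key:
            row.update(new_attrs)
    return rows


def append_new_version(rows, key, new_attrs, changed_on):
    """Close the current row and append a new one, so history is kept."""
    for row in rows:
        # The current version is the one with no end date yet (assumed convention).
        if row["customer_id"] == key and row["valid_to"] is None:
            row["valid_to"] = changed_on
            rows.append(
                {**row, **new_attrs, "valid_from": changed_on, "valid_to": None}
            )
            break
    return rows


# Hypothetical sample data: one customer who moves from Lisbon to Porto.
overwritten = update_in_place(
    [{"customer_id": 1, "city": "Lisbon"}], 1, {"city": "Porto"}
)

history = append_new_version(
    [
        {
            "customer_id": 1,
            "city": "Lisbon",
            "valid_from": date(2024, 1, 1),
            "valid_to": None,
        }
    ],
    1,
    {"city": "Porto"},
    date(2025, 10, 30),
)
```

After the in-place update only the Porto row exists. After the append there are two rows: the Lisbon row with its `valid_to` closed off, and a new open-ended Porto row.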

October 30, 2025 · 5 min

Building a Reverse ETL Pipeline: Upserting Delta Lake Data into Postgres with Structured Streaming

In this post, I share how to build a Reverse ETL pipeline that upserts data from Delta (Databricks) into Postgres, so that our tables can be served with sub-second response times. The goal is to make warehouse data available to downstream systems that require millisecond response times. These systems could be front-end applications that consume this data, or online machine learning models that need additional data as input to generate predictions in real time. ...

May 8, 2025 · 9 min