Creating Your First Azure Data Factory

Azure Data Factory has grown in both popularity and utility in the past several years. It has evolved beyond its significant limitations in its initial version, and is quickly rising as a strong enterprise-capable ETL tool. In my last post on this topic, I shared my comparison between SQL Server Integration Services and ADF. In this post, I’ll walk through…


My New Favorite Demo Dataset: Dunder Mifflin Data

Those of us who write technical articles and deliver technical presentations are always on the lookout for the perfect data set for demonstration and testing. Microsoft has done a good job of putting together sample databases including Wide World Importers, AdventureWorks, and Northwind Traders. Personally, I’ve found each of these useful, but I had no particular attachment to any of…


Reusing a Recordset in an SSIS Object Variable

A few years back, I wrote a blog post about using an SSIS object variable as a data flow source. In that post, I described how you could load a set of query results into an object-typed variable in SQL Server Integration Services and then use that in-memory data as a source within a data flow. In the comments and…


The What, Why, When, and How of Incremental Loads

When moving data in an extraction, transformation, and loading (ETL) process, the most efficient design pattern is to touch only the data you must, copying just the data that was newly added or modified since the last load was run. This pattern of incremental loads usually presents the least amount of risk, takes less time to run, and preserves the…


Comparing SSIS and Azure Data Factory

For the better part of 15 years, SQL Server Integration Services (SSIS) has been the go-to enterprise extract-transform-load (ETL) tool for shops running on Microsoft SQL Server. More recently, Microsoft added Azure Data Factory (ADF) to its stable of enterprise ETL tools. In this post, I’ll be comparing SSIS and Azure Data Factory to share how they are alike and…


The SSIS Catalog: Install, Manage, Secure, and Monitor your Enterprise ETL Infrastructure

I’m happy to announce the publication of my latest book. The SSIS Catalog: Install, Manage, Secure, and Monitor your Enterprise ETL Infrastructure is now available on Amazon in both paperback and Kindle formats. In this book, I introduce the reader to the SSIS catalog and describe how it fits into an enterprise ETL architecture. This book is, by design, narrowly…


SSIS Parameters

SSIS parameters help to ease the process of passing runtime values to SSIS packages. For values that can change over time, using parameters in SSIS is an ideal means of externalizing those runtime values. In this post, I’ll demonstrate how to get started using SSIS parameters. What Problem Are We Solving? To understand the need for ETL parameters, let’s first…


Secure Your Data Prep Area

I’ve been building ETL processes for many years, and I’ve learned two universal truths: data preparation is messy, and you must always secure your data prep area. A data prep area is very similar to a commercial kitchen, and in the same way that customers aren’t allowed in the kitchen, so should data consumers be kept from accessing in-process data…


Lessons from Iowa’s Caucus Debacle

Earlier this week, the state of Iowa held its caucuses to choose each political party’s nominee for November’s presidential election. Being the first state in each election cycle to do this, the electorate of Iowa has the attention of the nation; a candidate’s healthy or anemic showing in that state often foretells his or her success for the rest of…


Using the JOIN Function in Reporting Services

The JOIN() function in SQL Server Reporting Services is a handy tool that allows you to turn a list into a delimited string value. This function accepts two parameters, a list and a delimiter with which to separate the output, and returns a string with that list separated by the specified delimiter character. Using the JOIN Function in Reporting Services…