ETL

Temp Tables in SSIS

Temp tables are very handy when you have the need to store and manipulate an interim result set during ETL or other data processing operations. However, if you use SQL Server Integration Services as your ETL tool, you may find some challenges when trying to work with temp tables in SSIS packages, especially in the SSIS data flow. In this…


Managing Business Logic

Encapsulating business logic into data movement and presentation is a critical part of a stable information management strategy. Too often, though, business logic is built and added late in the process, forcing it into whatever nooks and crannies are available. While this duct-tape approach sometimes works, it makes the resulting system difficult to maintain when the business logic is spread…


Using ETL Staging Tables

Most traditional ETL processes perform their loads using three distinct and serial processes: extraction, followed by transformation, and finally a load to the destination. However, for some large or complex loads, using ETL staging tables can make for better performance and less complexity. As part of my continuing series on ETL Best Practices, in this post I will some advice…


SSIS Catalog Logging Tables

Making the most of the SSIS catalog requires an understanding of how to access the information stored in the logging tables. Although there are built-in reports to show this information, there are limitations in their use. Fortunately, the logging tables in the SSIS catalog database are (mostly) straightforward and easy to understand once you’ve worked with them a bit. In…


Retrieve a List of Files from FTP using SSIS

The FTP protocol is one of the oldest methods for sharing and moving files. Although frequently considered to be an “old-school” way to transfer data, FTP is still relied upon in most every data movement architecture. Sadly, the functionality around FTP is very limited in SQL Server Integration Services. A common project requirement is to retrieve a list of files from…


What is ETL?

In my ongoing series on ETL Best Practices, I am illustrating a collection of extract-transform-load design patterns that have proven to be highly effective. In the interest of comprehensive coverage on the topic, I am adding to the list an introductory prequel to address the fundamental question: What is ETL? What Is ETL? ETL is shorthand for the extraction, transformation,…


ETL Error Handling

In designing a proper ETL architecture, there are two key questions that must be answered. The first is, “What should this process do?” Defining the data start and end points, transformations, filtering, and other steps must be done before any other work can proceed. The second question that must be answered is “What should happen when the process fails?” Too…


SSIS Data Taps

One of my favorite testing features of SSIS is also one of the most underutilized. SSIS data taps were introduced with the SSIS catalog in SQL Server 2012 as a way to capture data within one leg of a data flow task and write it out to a file for testing or auditing purposes. In this brief post, I’ll show…


A Shortcut for Parameterizing Settings in SSIS

I’ve written quite a bit about the benefit of externalizing changing values in SSIS packages. Moving static values such as connection strings and file paths to a configurable input makes easier the tasks of testing, changing, and auditing the process in the future. The short and generic story here is: don’t hard code values that can change. In SQL Server…


A Better Way to Execute SSIS Packages with T-SQL

There are several ways to execute SSIS packages that have been deployed to the SSIS catalog, and my favorite way of performing this task is to execute SSIS packages with T-SQL. There are a couple of challenges that come up when executing catalog-deployed packages using the default settings in T-SQL, but I have workaround for those issues which I’ll cover…