Tim Mitchell
Follow Tim Mitchell on Twitter  Subscribe on YouTube  Connect on LinkedIn  Subscribe to the Data Geek Newsletter

The SSIS Catalog

SSIS CatalogThe SSIS catalog is a system through which SQL Server Integration Services (SSIS) packages are stored, executed, and logged. Introduced with the release of SQL Server 2016, the SSIS catalog helps to better adapt Integration Services as an enterprise-ready ETL system.

In this post, I’ll share a brief introduction to the SSIS catalog, as well as links to content on this site and elsewhere with more detailed information.

The SSIS Catalog: A brief introduction

The SSIS catalog was built to be the center of the universe when it comes to SSIS packages. It was designed to serve as the deployment target for storing packages, eliminating the need to store packages in the MSDB database. Packages deployed to the SSIS catalog were also executed using built-in logic in catalog stored procedures, allowing for easy execution of packages via T-SQL. It also simplified the logging process, enabling a simple runtime setting to define how packages are logged. Finally, it improved security by allowing more granular controls over who would do what to which packages.

The catalog data resides in a database named SSISDB which stores the projects, configuration values, and logging information. This database resides on an instance of Microsoft SQL Server, and requires at least Standard edition of that database engine. The catalog includes an execution runtime that handles the execution and management. Also included is a set of basic reports for a quick glimpse into recent catalog activity.

Getting Started

Getting started with the SSIS catalog is relatively easy. I’ve published a brief post on how to create the catalog database. For more information on how to get started using the catalog, visit Microsoft’s documentation on the topic.

Using the Catalog

There are several ways to execute packages stored in the catalog. SQL Server Agent jobs, T-SQL scripts, and the command line app dtexec.exe can all be used to invoke packages. I recommend SQL Server Agent for scheduled jobs, and using T-SQL commands for ad-hoc executions. I’ve also shared a script I wrote showing some modifications to the default way of executing packages via T-SQL.

You can always remove packages from the catalog as well. However, removing a single package isn’t a simple operation, which I describe in this brief post.

The SSIS catalog has its own metadata validation tools as well. You can run a package validation prior to (or instead of) execution to verify some of its underlying metadata.

Logging

One of the most useful features of the SSIS catalog is the built-in system for activity logging. In earlier versions, configuring and monitoring logs was a cumbersome task. The catalog introduced a new set of standardized logging tables that eliminate a lot of the manual work and ambiguity from execution logging.

The scope and amount of data logged is defined by the selected logging level. There are five broad predefined logging levels to allow you to set the amount of data logged per execution. If these logging levels do not offer enough flexibility, you can create your own custom logging levels.

There is also an easy way to track the runtime parameters used by the package. This information is logged by default on every package execution, and is very useful for troubleshooting and auditing.

Execution Reports

Although the built-in reports are useful, they have their limitations. The reports are only available from within SQL Server Management Studio (SSMS), and you’re limited to the data as presented in these canned, read-only reports. To get around this, I have published a free set of dashboard reports for the SSIS catalog that can be modified as needed to monitor the SSIS catalog. These reports can be deployed to SQL Server Reporting Services, eliminating the need to run the reports solely from within SSMS.

Of course, you can create your own custom reports from the SSISDB database as well. If you do, you’ll likely need to add some “helper” tables, such as the execution status list script I created.

If you want to log the data (not just the events and messages) for an execution, check into data taps. A data tap allows you to write data flow data out to a CSV for review or auditing.

Maintenance and Administration

When you deploy a new version of a project to the SSIS catalog, the old version is retained in an archived state. This allows some visibility into prior versions for auditing or testing purposes, or to quickly recover from an accidental deployment. By default, up to 10 versions of each project will be retained in archive. This post goes into greater detail on project versioning.

Those logging tables I mentioned earlier can get quite large over time. To avoid log table bloat, there is a process that purges log data older than a certain age (by default, 365 days). This process, explored in more detail here, runs each day at midnight. While this process is useful, it can perform poorly on large log tables. I’ve written a custom script that handles this purge process in a more efficient way.

Next Steps

For a more hands-on explanation of the SSIS catalog and its moving parts, watch my video introduction on this topic.